C2W3 Lab assignment: data dimensions in model

After reading the function model() in Lab Assignment 3.3, a few questions came up:

Let C = number of features, m = number of samples

  1. In previous sections, we implemented two functions, forward_propagation() and compute_total_loss(), both of which expect inputs of shape (C, m). In model(), the inputs X_train and Y_train are also provided with shape (C, m). However, when forward_propagation() and compute_total_loss() are called from model(), the transposed versions of X_train and Y_train are passed as arguments. Calling tf.transpose(X_train) and tf.transpose(Y_train) changes their shapes to (m, C), which is not what forward_propagation() and compute_total_loss() expect.

  2. I know that X_train and Y_train are fed to tf.data.Dataset and used to create mini-batches. Does tf.data.Dataset or tf.data.Dataset.batch somehow change the shape of X_train and Y_train?

  3. I also noticed that X_test and Y_test are turned into a tf.data.Dataset object and mini-batches are created from it. What are the reasons behind splitting the test set into mini-batches (I assume "test" here means validation)? The test mini-batches are used for predictions every 10 training epochs. What is the disadvantage of treating the test/validation set as one batch?
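For concreteness, here is the shape bookkeeping from question 1, sketched with NumPy as a stand-in (the sizes are arbitrary, not the lab's actual dataset dimensions):

```python
import numpy as np

C, m = 5, 8                          # illustrative sizes: C features, m samples
X_train = np.zeros((C, m))           # features-first, as forward_propagation() expects

# Transposing swaps the two dimensions, putting samples first.
X_transposed = X_train.T
assert X_transposed.shape == (m, C)  # (8, 5): no longer the (C, m) layout
```

This confirms the concern in question 1: after the transpose, the samples dimension comes first, so the arrays no longer match the (C, m) layout that forward_propagation() and compute_total_loss() were written for.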


Here’s a thread which explains how the dimension orientation is handled in this assignment.

The tf.data.Dataset class does not change the orientation of the input data: it requires that the samples dimension be the first dimension of its input, and it subdivides along that dimension to form mini-batches. The point that TF assumes samples are the first dimension is also covered in the thread linked above.
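A NumPy sketch of the behavior described above, mimicking (not calling) what tf.data.Dataset.batch does, namely slicing along the first (samples) dimension; all names and sizes here are illustrative:

```python
import numpy as np

C, m, batch_size = 4, 10, 3
X = np.arange(C * m).reshape(C, m)   # features-first, shape (C, m)
X_T = X.T                            # samples-first, shape (m, C): the layout Dataset expects

# Mimic Dataset.batch: subdivide along the first (samples) dimension.
# The last batch is smaller when m is not a multiple of batch_size.
batches = [X_T[i:i + batch_size] for i in range(0, m, batch_size)]
print([b.shape for b in batches])    # -> [(3, 4), (3, 4), (3, 4), (1, 4)]

# Each mini-batch can then be transposed back to features-first before the
# forward pass, restoring the (C, batch) orientation the layer code expects.
first_batch = batches[0].T
assert first_batch.shape == (C, batch_size)   # (4, 3)
```

This is why the transpose in model() is not a bug: the data is reoriented to samples-first only so that batching works, and each mini-batch can be reoriented back before being handed to the functions that expect features-first inputs.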

It is generally a good idea to add logic to process the data in mini-batches, and the Dataset class in TF is commonly used for that. If you batch at all, it is simplest to batch all of your input datasets, train and test alike, so that the same prediction code path handles both.