Confused with some TF concepts

Hi there,

I am slightly confused by some TF concepts, and I am trying to understand dataset shapes.

  1. Why are the datasets this weird generator object rather than a constant of shape (nx, m) ?

  2. When I have a dataset (e.g. x_train):
    x_train.element_spec only shows the shape of one example.
    How can I see how many examples are contained? With NumPy, the number of examples was part of the shape.

  3. In W3 exercise 6 - instructions say: It’s important to note that the “y_pred” and “y_true” inputs of tf.keras.losses.categorical_crossentropy are expected to be of shape (number of examples, num_classes).
    How could I have deduced the expected shapes from the TF documentation?

  4. In W3 section 6 (training the model), why is a transpose of minibatch_X passed, as in forward_propagation(tf.transpose(minibatch_X), parameters)? My understanding is that X_train is of shape (input size = 12288, number of training examples = 1080) and that forward_propagation() takes an input parameter X of shape (input size, number of examples).

Thanks for the help.

What “weird generator object” are you referring to?

A TensorFlow model doesn’t care how many examples there are; that doesn’t impact the model design. You don’t have a size until you fit the model to a specific dataset.

Try x_train.shape perhaps.

The TensorFlow documentation assumes you already are an expert. Sometimes experience is the best teacher. Tip: often you have to look up the properties of the parent object, when one exists.

Sometimes you have to transpose things in TensorFlow simply to avoid getting error messages about shapes. There aren’t a lot of standards about what the default shapes should be.

Other mentors will probably have more technically-oriented explanations. I’m the practical voice.

Hello @Leonard_Bouygues1,

For your questions 2, 3, and 4: they will become clear after some work, and that work is something I can suggest for you.

  1. A generator-type dataset lets you feed data to the training process without having to preload everything into memory, which is obviously necessary when your data is larger than your memory. This is a course, and this is something we need to learn and be able to use.

  2. Check out the cardinality() method in TensorFlow’s Dataset documentation and study how it works. For other approaches, search Stack Overflow, where their pros and cons are also discussed.

  3. Check out the axis parameter and its explanation in the documentation, think about it, and experiment with it.

  4. Take a minibatch_X out, print its shape, and determine why a transpose operation was needed.
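To make point 1 concrete, here is a minimal sketch (assuming TensorFlow 2.x; the generator and sizes are made up for illustration) showing that a generator-backed Dataset produces examples lazily, one at a time, so the full dataset never has to sit in memory at once:

```python
import numpy as np
import tensorflow as tf

def example_gen():
    # In a real pipeline, each example might be read from disk here.
    for i in range(1080):
        yield np.full((12288,), float(i), dtype=np.float32)

ds = tf.data.Dataset.from_generator(
    example_gen,
    output_signature=tf.TensorSpec(shape=(12288,), dtype=tf.float32),
)

# element_spec describes ONE example, not the whole dataset:
print(ds.element_spec)  # TensorSpec(shape=(12288,), dtype=tf.float32, ...)
```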
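For point 2, a quick sketch (TensorFlow 2.3+; the sizes are just the course’s numbers, used for illustration) of how cardinality() answers the “how many examples?” question:

```python
import tensorflow as tf

ds = tf.data.Dataset.from_tensor_slices(tf.zeros((1080, 12288)))

# cardinality() returns the number of examples when TF can infer it:
print(ds.cardinality().numpy())  # 1080

# For generator-backed datasets, TF cannot infer the size and returns
# tf.data.UNKNOWN_CARDINALITY; a fallback is to iterate and count,
# which is exact but touches every example:
n = sum(1 for _ in ds)
```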
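For point 3, a small sketch (TensorFlow 2.x, toy numbers) of the (number of examples, num_classes) convention; by default the class axis is the last one (axis=-1), which is why that shape works:

```python
import tensorflow as tf

y_true = tf.constant([[0., 1., 0.],
                      [1., 0., 0.]])     # shape (2, 3): 2 examples, 3 classes
y_pred = tf.constant([[0.1, 0.8, 0.1],
                      [0.7, 0.2, 0.1]])  # shape (2, 3)

loss = tf.keras.losses.categorical_crossentropy(y_true, y_pred)
print(loss.shape)  # (2,) -- one loss value per example, reduced over axis=-1
```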
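And for point 4, the transpose can be checked with a toy minibatch (the batch size of 32 here is just an illustrative assumption): the Dataset yields minibatches as (examples, features), while the assignment’s forward_propagation expects (input size, number of examples).

```python
import tensorflow as tf

minibatch_X = tf.zeros((32, 12288))  # (batch size, input size) from the Dataset
X = tf.transpose(minibatch_X)        # (input size, batch size) = (12288, 32)
print(minibatch_X.shape, X.shape)
```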

Good luck!

Thank you @rmwkwok and @TMosh

An additional question if you don’t mind:

In Course #1, Prof Ng advises not to use NumPy arrays of shape (n,) and to instead initialize them to shape (n, 1).

In Course #2, W3 exercise, the dataset has an initial shape of (64, 64, 3). We then use tf.reshape(image, [-1,]) to reshape it into (12288,).

  • What does the comma followed by nothing mean?
  • Could we not specify [-1, 1] to get a (12288, 1) shape? It seems to break the model later on.

Thanks for your insights.

Hi @Leonard_Bouygues1 ,

Prof Ng’s advice is prudent: explicitly specify what the array’s shape should be.
When a shape component is -1, it has a special meaning. Here is a quote from the tf.reshape documentation:
“If one component of shape is the special value -1, the size of that dimension is computed so that the total size remains constant. In particular, a shape of [-1] flattens into 1-D. At most one component of shape can be -1.”

That means the result is a 1-D array or tensor: there is literally only one dimension.
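A quick sketch (TensorFlow 2.x) of the difference; note that the trailing comma in [-1,] is plain Python list syntax and changes nothing:

```python
import tensorflow as tf

image = tf.zeros((64, 64, 3))

flat = tf.reshape(image, [-1])     # shape (12288,): 1-D, a single dimension
col  = tf.reshape(image, [-1, 1])  # shape (12288, 1): 2-D column vector
print(flat.shape, col.shape)
```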