The instructions for question 1 are written:
" Exercise 1
Find the values for:
- m_train (number of training examples)
- m_test (number of test examples)
- num_px (= height = width of a training image)
Remember that train_set_x_orig is a numpy-array of shape (m_train, num_px, num_px, 3). For instance, you can access m_train by writing train_set_x_orig.shape[0]."
I am having trouble understanding Exercise 1 on the programming assignment
Logistic_Regression_with_a_Neural_Network_mindset
When calculating num_px, I am not sure whether we are supposed to use train_set_x_orig or test_set_x_orig, since the height and width are the same for all of the images. .shape[1] would get the height. Does it matter whether we use train_set_x_orig or test_set_x_orig? Also, since the width and height of each image are both the same value (num_px), I am not sure whether to use train_set_x_orig.shape[1] or train_set_x_orig.shape[2].
Exercise 2 Questions:
The instructions say "Reshape the training and test data sets so that images of size (num_px, num_px, 3) are flattened into single vectors of shape (num_px ∗ num_px ∗ 3, 1).
A trick when you want to flatten a matrix X of shape (a,b,c,d) to a matrix X_flatten of shape (b ∗ c ∗ d, a) is to use:
X_flatten = X.reshape(X.shape[0], -1).T # X.T is the transpose of X"
I've read the instructions, but I am still having a hard time understanding what this means.
I apologize if my questions are not clear. It's been a while since I have taken linear algebra, but I do know it.
Exercise 1:
‘m_train’ comes from the training set.
‘m_test’ comes from the test set.
‘num_px’ could come from either, because the training and test sets must be drawn from the same distribution - so their examples will both have the same number of features.
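As a minimal sketch of reading these values off the array shapes, here is an example on zero-filled stand-in arrays rather than the real dataset (the sizes 5, 2 and 4 are made up for illustration):

```python
import numpy as np

# Stand-in arrays with the layout (m, num_px, num_px, 3).
# The sizes are hypothetical, not those of the assignment's data.
train_set_x_orig = np.zeros((5, 4, 4, 3))
test_set_x_orig = np.zeros((2, 4, 4, 3))

m_train = train_set_x_orig.shape[0]  # number of training examples
m_test = test_set_x_orig.shape[0]    # number of test examples
num_px = train_set_x_orig.shape[1]   # height; shape[2] (width) holds the same value

print(m_train, m_test, num_px)
```

Axis 0 counts the examples, axes 1 and 2 are the height and width, and axis 3 is the colour channel.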
For all that text in the Exercise 2 instructions, it boils down to this (the syntax is pretty obscure, so it’s better to see it and then assess how it works):
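A minimal sketch of that flattening, applied to a small made-up array instead of the assignment's data (the names m and X and the sizes are assumptions for illustration):

```python
import numpy as np

m, num_px = 5, 4  # hypothetical sizes
# Fill the array with distinct values so the layout can be checked.
X = np.arange(m * num_px * num_px * 3).reshape(m, num_px, num_px, 3)

# Keep axis 0 (one row per image), let -1 infer num_px * num_px * 3,
# then transpose so that each image becomes one column.
X_flatten = X.reshape(X.shape[0], -1).T

print(X_flatten.shape)  # (48, 5)

# Column 0 of the result is exactly the first image, unrolled.
print(np.array_equal(X_flatten[:, 0], X[0].reshape(-1)))  # True
```

The -1 tells NumPy to work out that dimension from the total number of elements, so the same line works for any image size.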
Hi @reinhardt_scott,
In regard to your question 1, I am referencing the instruction below:
2 - Overview of the Problem set
Problem Statement: You are given a dataset (“data.h5”) containing:
- a training set of m_train images labeled as cat (y=1) or non-cat (y=0)
- a test set of m_test images labeled as cat or non-cat
- each image is of shape (num_px, num_px, 3) where 3 is for the 3 channels (RGB). Thus, each image is square (height = num_px) and (width = num_px).
Here, num_px is the number of pixels along one side of an image. Each image has the same number of pixels for its width and height, so it is a square image. In this exercise, both the training set and the test set use images of the same size, so whether num_px is taken from the training set or the test set, you would still get the correct answer. However, it would be more appropriate to get num_px from the training set. Similarly, it doesn’t matter whether you get num_px from train_set_x_orig.shape[1] or train_set_x_orig.shape[2].
For your second question, what it means is that you need to change the array representation of each image from an array of size (num_px, num_px, 3) into a single column vector of size (num_px * num_px * 3, 1); here the ‘*’ is the multiplication operator.
Kin has given you the link to explain the role of the -1 in that “reshape” command. If you are also curious about why there is a transpose involved, here’s another historical thread that explains that in detail.
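As a small illustration of why the transpose matters (my own sketch on a made-up array, not taken from the linked threads), compare the recommended form with reshaping straight to the target shape:

```python
import numpy as np

m = 3  # hypothetical number of tiny "images"
X = np.arange(m * 2 * 2 * 3).reshape(m, 2, 2, 3)  # each image holds 12 values

good = X.reshape(m, -1).T  # column i holds all the values of image i
bad = X.reshape(-1, m)     # same shape, but each column mixes values from different images

print(good.shape, bad.shape)                     # (12, 3) (12, 3)
print(np.array_equal(good[:, 0], X[0].ravel()))  # True
print(np.array_equal(bad[:, 0], X[0].ravel()))   # False
```

Both results have the same shape, so the mistake does not raise an error; only the first keeps each image's pixels together in one column.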