Hi learners. I am doing the last programming assignment for week 2, which is logistic regression with a neural network mindset. However, I do not understand something in exercise 1. The training set is a 4D array of shape (209, 64, 64, 3), in which 209 is the number of examples, 64 is the height and width of the images, and 3 is the depth. I don’t get why 3 is the last dimension. I mean, shouldn’t the shape be (209, 3, 64, 64)?

Hi,

By the convention used here, the dimensions are written as (number of examples, height, width, number of channels).

Yes, it is essentially arbitrary whether you put the “channel” dimension before or after the height and width dimensions, but the way it is done here is to put the channels last. There are 3 channels to represent the color values of each pixel: Red, Green and Blue. Each of those is an 8-bit unsigned integer, which means the values range from 0 to 255.
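To make the channels-last layout concrete, here is a minimal sketch. The array `train_x` is a hypothetical stand-in filled with random values, not the actual assignment data; only the shape and dtype match what is described above.

```python
import numpy as np

# Hypothetical stand-in for the training set: 209 random 64x64 RGB images.
rng = np.random.default_rng(0)
train_x = rng.integers(0, 256, size=(209, 64, 64, 3)).astype(np.uint8)

# Axes are (example, height, width, channel).
m, height, width, channels = train_x.shape

# One pixel of image 0 (row 10, column 20) is a length-3 vector: R, G, B.
pixel = train_x[0, 10, 20]

# Fixing the last index selects one full 64x64 color plane of an image.
red_plane = train_x[0, :, :, 0]

print(train_x.dtype, pixel.shape, red_plane.shape)
```

So the last axis does not replace the width; it holds the three color values stored at each (row, column) position.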

Here’s a thread which explains how we “unroll” or “flatten” those images into vectors that can be processed by the Logistic Regression algorithm.
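As a rough illustration of that flattening step, one common approach is a reshape followed by a transpose, so that each column holds one image. This is a sketch on a hypothetical random array `train_x`, and the variable names are assumptions rather than the assignment's own.

```python
import numpy as np

# Hypothetical stand-in for the training set, shape (209, 64, 64, 3).
rng = np.random.default_rng(0)
train_x = rng.integers(0, 256, size=(209, 64, 64, 3)).astype(np.uint8)

# Keep the example axis, collapse the other three into one, then transpose
# so that each column is one flattened image of 64 * 64 * 3 = 12288 values.
train_x_flat = train_x.reshape(train_x.shape[0], -1).T

print(train_x_flat.shape)  # (12288, 209)

# Column i contains exactly the values of image i, in row-major order.
first_image_matches = np.array_equal(train_x_flat[:, 0], train_x[0].ravel())
```

With channels last, the three color values of a pixel end up next to each other in the flattened vector.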

Thanks for your reply. But for example, if we create some zero arrays in Python with np.zeros((209, 3, 64, 64)) and np.zeros((209, 64, 64, 3)), we would get different results. The first gives 209 sets of 3 matrices, each with 64 rows and 64 columns, but the second gives 209 sets of 64 matrices, each with 64 rows and 3 columns. Am I right? How can the second structure hold the information of 209 images with a width and height of 64?
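A small sketch may help show that both layouts carry the same information and differ only in axis order. The array here is random, hypothetical data, and `np.transpose` is used to move between the two orderings.

```python
import numpy as np

# Hypothetical channels-last data: (examples, height, width, channels).
rng = np.random.default_rng(1)
channels_last = rng.integers(0, 256, size=(209, 64, 64, 3)).astype(np.uint8)

# Reorder the axes to channels-first: (examples, channels, height, width).
channels_first = np.transpose(channels_last, (0, 3, 1, 2))
print(channels_first.shape)  # (209, 3, 64, 64)

# The blue plane of image 5 is the same 64x64 matrix in both layouts.
same_plane = np.array_equal(channels_last[5, :, :, 2], channels_first[5, 2, :, :])

# Moving the channel axis back recovers the original array exactly.
round_trip = np.transpose(channels_first, (0, 2, 3, 1))
recovered = np.array_equal(round_trip, channels_last)
```

In the channels-last array, each image is 64 rows of 64 pixels, and each pixel is a group of 3 values, so the full 64 by 64 image is still there.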

Oh, I got it after seeing that thread. Thanks a lot.