[Week 1] - Programming assignment 1, numpy array explanation

I’m currently taking a course on CNNs on Coursera, and in one of the programming assignments we have the following:

Z = np.zeros((m,n_H,n_W,n_C)), where n_C is supposedly the number of channels.

Now, the output I’m expecting from this is an array of m training examples, each with n_C slices (for example, 3 channels for an image), where each channel is a 2D array of height n_H and width n_W. However, Z does not print that way (it prints 5 slices/2D arrays, each with dimensions 5x3). Since n_C comes last in the shape passed to np.zeros, a 5x5 image with 3 channels would, following the course’s logic, be created like this:

Z = np.zeros((5,5,3))

However, I would expect it to be more like:

Z = np.zeros((3,5,5)), since this is the code that produces 3 2D arrays (one per channel) of 5x5 each.

Why should we have the order m,n_H,n_W,n_C if we’re trying to depict an image as a 3D array? Shouldn’t it be m,n_C,n_H,n_W to get m training examples with n_C channels of dimensions n_H x n_W?

Thank you

There are 2 conventions for where the channel dimension goes.

  1. It’s common in TensorFlow code to put channels in the last dimension (aka channels last).
  2. The channels-first convention is widely seen in PyTorch code.
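
To make the two layouts concrete, here is a minimal NumPy sketch. The sizes (2 examples, 5x5 images, 3 channels) and the variable names Z_last, Z_first and channel0 are made up for illustration and are not from the assignment. It suggests that both layouts hold the same data: indexing the last axis of a channels-last array still gives a full n_H x n_W channel, and np.transpose reorders the axes into channels first.

```python
import numpy as np

# Hypothetical sizes for illustration: 2 examples of 5x5 images with 3 channels.
m, n_H, n_W, n_C = 2, 5, 5, 3

# Channels last, as in the assignment: shape (m, n_H, n_W, n_C).
Z_last = np.zeros((m, n_H, n_W, n_C))
Z_last[0, 1, 2, 0] = 7.0  # example 0, row 1, column 2, channel 0

# Printing Z_last[0] shows n_H blocks of shape (n_W, n_C), but selecting
# one index on the last axis still yields a full n_H x n_W channel.
channel0 = Z_last[0, :, :, 0]
print(channel0.shape)  # (5, 5)

# Reorder the axes to channels first: shape (m, n_C, n_H, n_W).
Z_first = np.transpose(Z_last, (0, 3, 1, 2))
print(Z_first.shape)        # (2, 3, 5, 5)
print(Z_first[0, 0, 1, 2])  # 7.0, the same element as Z_last[0, 1, 2, 0]
```

So the choice between the two orders is a convention about axis order rather than a difference in what is stored.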

Thank you. With NumPy I’m used to seeing channels first, so I guess PyTorch is more pythonic in that sense.