Incorrect Instruction for Ex 5 of Python_Basics_with_Numpy

In Exercise 5 of Basic with Numpy:

The instruction states:
“This is a 3 by 3 by 2 array, typically images will be (num_px_x, num_px_y,3) where 3 represents the RGB values”

This is incorrect. The input tensor is not of shape 3x3x2. It is 3x2x3.

Please correct this. It is confusing.

I added this cell:

print(f"t_image.shape = {t_image.shape}")

And here’s what I see when I run it:

t_image.shape = (3, 3, 2)

The point is that it’s just a question of correctly interpreting how the “print” of the 3D array works:

It shows 3 arrays by enumerating the first dimension. Then each of the 3 arrays is a 3 x 2 matrix, meaning that the whole array is 3 x 3 x 2.
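To make that reading concrete, here is a small sketch. The array `demo` is a stand-in I built with `np.arange`, not the notebook's actual `t_image`; any array of shape (3, 3, 2) prints with the same bracket layout.

```python
import numpy as np

# Hypothetical stand-in for the notebook's t_image: any (3, 3, 2) array behaves the same
demo = np.arange(18).reshape(3, 3, 2)

print(demo.shape)       # (3, 3, 2)
print(len(demo))        # 3: the outermost brackets enumerate the first dimension
print(demo[0].shape)    # (3, 2): each printed block is a 3 x 2 matrix
print(demo[0])          # the first of the three blocks
```

Indexing with a single integer peels off the first dimension, which is exactly what the blank-line-separated blocks in the printout correspond to.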

Thx for the clarification.

I was following the linear algebra convention of (3,2) x 3 for a tensor.

I will keep in mind that Python differs from linear algebra conventions.

Appreciate the quick reply.

It’s a good point that python is python and frequently has its own way of doing things that may or may not agree with your preconceived notions from math. Another perhaps more “in your face” example of this is zero-based indexing. Or the fact that log means natural log, not log base 10. I could go on :laughing:
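A quick sketch of those two examples, in case it helps. The variable names are just for illustration.

```python
import math
import numpy as np

values = np.array([10.0, 20.0, 30.0])
first = values[0]            # zero-based indexing: index 0 is the first element

natural = np.log(np.e)       # np.log is the natural logarithm, so this is about 1.0
base10 = np.log10(100.0)     # base-10 log is a separate function, giving 2.0
stdlib = math.log(math.e)    # math.log with one argument is also the natural log

print(first, natural, base10, stdlib)
```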

To go one level deeper in the case at hand, here’s a way to get a better view of what is happening:

# routine to generate a telltale 3D array to play with
import numpy as np

def testarray(shape):
    (d1, d2, d3) = shape
    A = np.zeros(shape)

    # encode the three indices as decimal digits of the stored value
    for ii1 in range(d1):
        for ii2 in range(d2):
            for ii3 in range(d3):
                A[ii1, ii2, ii3] = ii1 * 100 + ii2 * 10 + ii3

    return A

That creates a 3D array in which the value at each position spells out its index values in order. In other words, A[1,2,3] = 123 (as long as each dimension has at most 10 entries, so the digits don't collide).

Let’s run that as follows:

A = testarray((3,3,2))
print("A.shape = " + str(A.shape))
print(A)

A.shape = (3, 3, 2)
[[[  0.   1.]
  [ 10.  11.]
  [ 20.  21.]]

 [[100. 101.]
  [110. 111.]
  [120. 121.]]

 [[200. 201.]
  [210. 211.]
  [220. 221.]]]

So you can see in the first 3 x 2 matrix, every entry has first index 0. In the second case, the first index is 1 and so forth. The other key “tell” is to check the matching of the brackets.
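You can also confirm this by indexing instead of reading brackets. The sketch below rebuilds the same telltale values with `np.indices` (a vectorized substitute I'm using here so the snippet stands alone, not the loop-based routine itself).

```python
import numpy as np

# Vectorized equivalent of the telltale array: value = 100*i + 10*j + k
i, j, k = np.indices((3, 3, 2))
A = (100 * i + 10 * j + k).astype(float)

print(A[1].shape)    # (3, 2): fixing the first index leaves a 3 x 2 matrix
print(A[1])          # the second printed block, every value in the 100s
print(A[2, 1, 0])    # 210.0: first index 2, second index 1, third index 0
```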

Thank you for the code. It’s very informative.

I hope you won’t mind if I ask one last question. Within Ex 5, the instruction states:

"image – numpy array of shape (length, height, depth) "

But using shape on the given matrix, we get (depth, row count, column count).

I reread the .shape syntax explanation, and it is true that the Python tensor notation is exactly what you shared (depth, row, column).

I don’t like to belabor minor points, but the instructions as given follow a different tensor notation convention than Python’s.

I would suggest the instruction be realigned to follow the python notation convention in order to avoid confusion for the learners who are new to python.

Am I incorrect in this?

I think you are applying context that is not relevant here and are just reading too much into the description. There is no “convention” for what the different dimensions of a tensor are in python. It is all determined by your definition of the data. They are getting us ready to deal with RGB images, which will be a common type of input data that we need to handle. The standard way to represent an image is with three dimensions in the following order:

height x width x colors

Where height and width are numbers of pixels and colors are typically either 1 (for B/W or greyscale images) or 3 for color images (RGB).
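Here is a minimal sketch of that layout, with made-up sizes (2 pixels high, 4 pixels wide). Note that NumPy only stores the sizes; calling the axes "height", "width" and "colors" is our interpretation of the data.

```python
import numpy as np

# Hypothetical sizes: 2 pixels high, 4 pixels wide
rgb = np.zeros((2, 4, 3), dtype=np.uint8)     # color image: 3 channels (R, G, B)
grey = np.zeros((2, 4, 1), dtype=np.uint8)    # greyscale image: 1 channel

rgb[0, 3] = [255, 0, 0]    # set the pixel in row 0, column 3 to pure red

h, w, c = rgb.shape        # the axis names are ours; NumPy just reports the sizes
print(h, w, c)             # 2 4 3
print(rgb[0, 3])           # the three channel values of that one pixel
print(grey.shape)          # (2, 4, 1)
```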

Now if you want to get persnickety here, note that they are reversing the height and width definitions. But that’s their prerogative.

Also note that the other “standard” representation for images is what people call “channels first” orientation, which would be:

colors x height x width

But Prof Ng will consistently use “channels last” orientation throughout all the courses that I’ve seen here.
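If you ever need to convert between the two orientations, `np.transpose` with an explicit axis order does it. This is only a sketch with a made-up 4 x 5 image, not anything taken from the course notebooks.

```python
import numpy as np

# Hypothetical tiny image in channels-last layout: (height, width, colors)
height, width, channels = 4, 5, 3
img_last = np.zeros((height, width, channels))
img_last[..., 0] = 1.0    # fill the red channel so it can be tracked

# Move the channel axis to the front: (height, width, colors) -> (colors, height, width)
img_first = np.transpose(img_last, (2, 0, 1))

print(img_last.shape)     # (4, 5, 3)
print(img_first.shape)    # (3, 4, 5)
```

After the transpose, `img_first[0]` is the full red channel as a 4 x 5 matrix.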

Thank you. I appreciate your detailed response and it is very helpful.