Image file Matrix notation

Hi there,

I’m having trouble understanding the image matrix. The lectures say that the image data is a numpy array of shape (m_train, num_px, num_py, 3), where m_train is the number of samples, num_px and num_py (both 64) are the dimensions of each image, and 3 is the number of color channels (red, green and blue).

So I tried to create an example of this (see below) to make sense of it.

I’m getting (2,2,3,2) with the example I set up. Does that mean I have 2 images, each of size 2x3, with only 2 color channels per image in this case?

Can someone confirm this? I think this is important to get a better understanding of the matrix system.

That’s right!
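To double-check this reading, here is a minimal sketch with a stand-in array. The original example is not shown here, so the zero-filled array and the variable names below are assumptions chosen only to reproduce the (2,2,3,2) shape.

```python
import numpy as np

# Hypothetical stand-in for the example in the question:
# 2 images, each 2 pixels by 3 pixels, with 2 values (channels) per pixel.
images = np.zeros((2, 2, 3, 2))

# Unpack the shape in the (samples, rows, columns, channels) convention
m, rows, cols, channels = images.shape
print(m, rows, cols, channels)

# Indexing the first axis selects one image
first_image = images[0]
print(first_image.shape)

# Indexing the first three axes selects one pixel, i.e. its channel values
pixel = images[0, 1, 2]
print(pixel.shape)
```

Each index you fix removes one dimension, which is an easy way to check what every axis stands for.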
To get more familiar with images, you can load your own JPEG files (or images in another format) with Python libraries like PIL (Pillow), or directly with matplotlib:

import matplotlib.image

# Read the file into a NumPy array and print its shape
image = matplotlib.image.imread('myImage.jpeg')
print(image.shape)

This outputs the shape of a single image.
If you have an array of images, there is one more dimension (generally the first or the last one), which corresponds to the image index.
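As a rough sketch of that extra dimension, the snippet below stacks two images into one array. The images are random placeholders (not real files), and the 64x64x3 size is assumed to match the course data.

```python
import numpy as np

# Two placeholder RGB images of shape (64, 64, 3), filled with random pixels
rng = np.random.default_rng(0)
img_a = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
img_b = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)

# Image index as the first dimension: (2, 64, 64, 3)
batch_first = np.stack([img_a, img_b], axis=0)
print(batch_first.shape)

# Image index as the last dimension: (64, 64, 3, 2)
batch_last = np.stack([img_a, img_b], axis=-1)
print(batch_last.shape)
```

With the index first, batch_first[0] gives back the first image; with the index last, you would write batch_last[..., 0] instead.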

I’ll let you browse the NumPy documentation to find for yourself the important methods on NumPy arrays for switching (transposing) the positions of the dimensions :slight_smile:
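If you want a starting point, here is one possible sketch using np.transpose and np.moveaxis. The zero-filled batch and the choice of target layout are assumptions made only for illustration.

```python
import numpy as np

# Placeholder batch in (samples, rows, columns, channels) order
batch = np.zeros((2, 64, 64, 3))

# Reorder all axes explicitly: move channels right after the sample index
channels_first = np.transpose(batch, (0, 3, 1, 2))
print(channels_first.shape)

# Same result by moving a single axis: last axis goes to position 1
also_channels_first = np.moveaxis(batch, -1, 1)
print(also_channels_first.shape)
```

np.transpose takes the full new order of the axes, while np.moveaxis only needs the axis you want to move and its destination.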
