Cat V NotCat dataset Structure

Narayan · December 24, 2022, 5:24am

I’m trying to understand the structure of the dataset used in the Week 2 assignment. These are the basic assumptions I’ve made (please correct me if I’m wrong):

The shape of the training set is (209, 64, 64, 3). This means 209 images of dimensions 64x64 with 3 color channels.
Shape and size are synonymous and are used interchangeably throughout the course.

Here’s my problem though:
My understanding is that the dataset is structured as [image1, image2, … image209].
Basically, 209 images inside a list.
But when I do the following:

import h5py
import matplotlib.pyplot as plt

t = h5py.File("datasets/train_catvnoncat.h5", "r")
train_set_x_orig = t["train_set_x"]
for i in train_set_x_orig:
    plt.imshow(i)

I’m getting only the first image of the dataset plotted. Is this an issue with my code or with my understanding of the dataset? Also, please correct me if any of the assumptions I’ve listed above are wrong. Thanks!

Elemento · December 24, 2022, 6:02am

Hey @Narayan,
Your first statement is correct. As for your second statement, it is true in some contexts and not in others.

For instance, when we are talking about a single image (considering it as a picture), we tend to say size, but when we are talking about a batch of images (considering each image as a matrix of values), we tend to say shape. But yes, I do believe they are used interchangeably throughout the course, most of the time.
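One caveat worth adding: in NumPy itself, `shape` and `size` are two different attributes, even if the course text uses the words loosely. A minimal sketch, using an array of zeros as a stand-in for the real training set (the name `images` is just for illustration):

```python
import numpy as np

# Stand-in for the training images: zeros with the shape quoted in this thread.
images = np.zeros((209, 64, 64, 3), dtype=np.uint8)

shape = images.shape  # tuple with one length per dimension
size = images.size    # total number of elements (product of the shape)
ndim = images.ndim    # number of dimensions

print(shape)  # (209, 64, 64, 3)
print(size)   # 2568192
print(ndim)   # 4
```

So when you are reading code rather than prose, `shape` is the one that tells you how the data is laid out.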

As for your code, you are not creating a separate figure for each image, which is why you see only a single image. Use the following code and you will get all the images:

for i in train_set_x_orig:
    plt.figure()
    plt.imshow(i)

Because you are not creating a figure explicitly, matplotlib draws every image onto the same figure, each one over the previous one, so in the end you only get to see the last image. I hope this helps.
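Opening 209 separate figures can be unwieldy, so an alternative is to draw a handful of images as subplots of one figure. This is only a sketch: the helper `show_grid` and the random `fake_images` array are made up for illustration and stand in for the real dataset.

```python
import numpy as np
import matplotlib.pyplot as plt


def show_grid(images, n_rows=2, n_cols=4):
    """Draw the first n_rows * n_cols images in one figure, one per subplot."""
    fig, axes = plt.subplots(
        n_rows, n_cols, squeeze=False, figsize=(2 * n_cols, 2 * n_rows)
    )
    # zip stops when the subplots run out, so only the first few images are used.
    for ax, img in zip(axes.ravel(), images):
        ax.imshow(img)
        ax.axis("off")
    return fig


# Random stand-in data with the same shape as the training set (not real cats).
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(209, 64, 64, 3), dtype=np.uint8)

fig = show_grid(fake_images)
```

In a notebook the figure is displayed at the end of the cell; in a plain script you would call `plt.show()` afterwards. With the real data you would pass `train_set_x_orig` instead of `fake_images`.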

Cheers,
Elemento

paulinpaloalto · December 24, 2022, 4:16pm

Elemento has covered all the important points, but just one minor addition:

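That is not a list: it is a numpy array that happens to have 4 separate dimensions. (Strictly speaking, indexing the h5py file the way your snippet does gives an h5py Dataset, which behaves like an array for this purpose and can be converted with np.array.) The first dimension is the “samples” dimension. When you iterate over a 4D array with a plain for loop, the loop steps through the first dimension, which is what you want for your purposes here.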
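To see the first-dimension behaviour for yourself, here is a small sketch using an array of zeros in place of the real images (the names `batch`, `count`, and `first_shape` are just for illustration):

```python
import numpy as np

# Stand-in batch with the same shape as the training set.
batch = np.zeros((209, 64, 64, 3), dtype=np.uint8)

count = 0
for image in batch:
    # Each step of the loop yields one sample: a 3D array of height x width x channels.
    assert image.shape == (64, 64, 3)
    count += 1

# Indexing with a single integer selects along the first dimension as well.
first_shape = batch[0].shape

print(count)        # 209
print(first_shape)  # (64, 64, 3)
```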

Note that we have to do some rearrangement of that data in order to use it as the input to the type of algorithm we are using here: a Feed Forward Neural Network needs the inputs as individual vectors and cannot handle 3D (for one sample) or 4D (for a batch of samples) arrays as inputs. They give us the logic to “flatten” the 4D array into a 2D array (a.k.a. a “matrix”) in which the “samples” dimension is defined to be the second dimension, in other words the columns. Here’s a thread which discusses the flattening process in some detail.
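As a rough illustration of that flattening step, here is one common way to do it with `reshape` and a transpose. Treat it as a sketch under my own assumptions rather than the notebook’s exact code: the array is random stand-in data, and the names `x_orig`, `m`, and `x_flat` are hypothetical.

```python
import numpy as np

# Random stand-in for the 4D training array: (samples, height, width, channels).
rng = np.random.default_rng(0)
x_orig = rng.integers(0, 256, size=(209, 64, 64, 3), dtype=np.uint8)

m = x_orig.shape[0]  # number of samples

# Unroll each image into one row of length 64 * 64 * 3 = 12288,
# then transpose so that every sample becomes a column.
x_flat = x_orig.reshape(m, -1).T

print(x_flat.shape)  # (12288, 209)
```

Column `j` of `x_flat` then holds exactly the pixel values of image `j`, which you can check with `np.array_equal(x_flat[:, j], x_orig[j].ravel())`.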
