C2W1 Assignment: why does the generator find only 22498 images when the previous step copied 25000 in total?



There are 2 Thumbs.db files in the zip file:

$ find /tmp/PetImages -name Thumbs.db
./Dog/Thumbs.db
./Cat/Thumbs.db

As a result, the cell that lists the contents of /tmp/PetImages reports 12501 files each for dogs and cats, when in reality there are only 12500 images in each folder.
To make the numbers match up, you can clean up your data by removing these files, or check for the .jpg file extension before checking the file length. Keep in mind that ImageDataGenerator counts only image files, so it skips Thumbs.db.
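
The extension check mentioned above could look roughly like this. It is only a sketch, not the assignment's reference solution: the helper name list_valid_images and its extensions parameter are made up for illustration, and it assumes every real image in the folder ends in .jpg.

```python
import os

def list_valid_images(source_dir, extensions=(".jpg",)):
    """Return the names of non-empty image files in source_dir.

    Hypothetical helper: the name and the `extensions` parameter are
    assumptions for this example, not part of the assignment.
    """
    valid = []
    for name in sorted(os.listdir(source_dir)):
        path = os.path.join(source_dir, name)
        if not os.path.isfile(path):
            continue  # skip sub-directories
        if not name.lower().endswith(extensions):
            continue  # skip Thumbs.db and any other non-image file
        if os.path.getsize(path) == 0:
            print(f"{name} is zero length, so ignoring.")
            continue
        valid.append(name)
    return valid
```

Calling list_valid_images("/tmp/PetImages/Cat/") before splitting should then leave Thumbs.db out of both the training and the testing folder.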

Assuming that you don’t have Thumbs.db files inside /tmp/cats-v-dogs, you can expect the output of listing your train / test images to change as follows:

666.jpg is zero length, so ignoring.
11702.jpg is zero length, so ignoring.


There are 11249 images of cats for training
There are 11249 images of dogs for training
There are 1250 images of cats for testing
There are 1250 images of dogs for testing

The cell that gets the generators will now make more sense:

Found 22498 images belonging to 2 classes.
Found 2500 images belonging to 2 classes.
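
If you want to double-check the arithmetic behind these counts, here is a small sketch. It assumes a 90/10 split where the training count is truncated with int(); that matches the numbers above, but your notebook's split_size may differ. The function name expected_split is made up for this example.

```python
def expected_split(n_files, n_non_images, n_zero_length, split_size=0.9):
    """Return (train, test) counts for one class.

    Assumes the training count is int(usable * split_size) and the rest
    goes to testing; this is an assumption, not the graded implementation.
    """
    usable = n_files - n_non_images - n_zero_length
    n_train = int(usable * split_size)
    return n_train, usable - n_train

# Per class: 12501 files listed, 1 Thumbs.db, 1 zero-length .jpg
train, test = expected_split(12501, 1, 1)
print(train, test)          # per-class counts
print(2 * train, 2 * test)  # totals the generators should report
```

With those inputs each class has 12499 usable images, which is where 11249 training and 1250 testing images per class come from.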

I had successfully solved the problem before finding this post, but thanks for sharing. It confirms my hypothesis, and I'm glad.