UnidentifiedImageError: cannot identify image file <_io.BytesIO object at 0x7fb5462359b0

Hi, I’m getting the error below while training the model. Can somebody help?

In the generator, are you pointing to the right source data files? Check whether you might have changed the source file URI.

Please ensure that image files of length 0 are not taken into consideration for the training / validation sets.

Here’s the function which should take care of this:

def split_data(SOURCE, TRAINING, TESTING, SPLIT_SIZE):

  ### START CODE HERE
  pass

  ### END CODE HERE

2 files should be left out. Here’s the expected output:

666.jpg is zero length, so ignoring.
11702.jpg is zero length, so ignoring.
There are 11250 images of cats for training
There are 11250 images of dogs for training
There are 1250 images of cats for testing
There are 1250 images of dogs for testing
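For anyone unsure how to detect the empty files, here is a minimal sketch of just the zero-length check, not the full split_data() solution. The helper name `list_usable_files` is hypothetical and not part of the assignment; it relies only on `os.path.getsize` from the standard library.

```python
import os

def list_usable_files(source_dir):
    """Return the sorted names of non-empty regular files in source_dir.

    Hypothetical helper for illustration; it is not the assignment's function.
    """
    usable = []
    for name in sorted(os.listdir(source_dir)):
        path = os.path.join(source_dir, name)
        if not os.path.isfile(path):
            continue  # skip sub-directories and other non-file entries
        if os.path.getsize(path) > 0:
            usable.append(name)
        else:
            # Same message format as the expected output above
            print(f"{name} is zero length, so ignoring.")
    return usable
```

The returned list, rather than the raw directory listing, is what you would then shuffle and split.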

I’ve solved this error. It was caused by some corrupt images in the dataset.

In case someone else is facing this issue, the solution below can be applied inside the split_data() function to check for corrupt images using the Pillow library.

[code removed - moderator]
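As a general starting point (this is not the code that was removed above), a Pillow-based check could look roughly like the sketch below. The helper name `is_readable_image` is hypothetical. It assumes Pillow 7.0 or later, where `UnidentifiedImageError` is available, and it uses `Image.verify()`, which checks file integrity without fully decoding the pixel data, so it may not catch every kind of corruption.

```python
from PIL import Image, UnidentifiedImageError

def is_readable_image(path):
    """Return True if Pillow can open and verify the file at path.

    Hypothetical helper for illustration only.
    """
    try:
        with Image.open(path) as img:
            img.verify()  # raises if the file looks broken
        return True
    except (UnidentifiedImageError, OSError, SyntaxError):
        # Unidentifiable, truncated, or otherwise unreadable file
        return False
```

Files for which this returns False would be left out of the list before splitting.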

@Rahul_Kapoor

The number of images before and after filtering is the same in your case. Calling del image_referece inside a for loop only deletes the loop variable; it doesn’t remove the entry from the underlying images list (which contains all images within the source directory).
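To illustrate the point, here is a small standalone sketch. The file names and the `bad` set are made up for the example, and the loop variable is spelled `image_reference` here.

```python
images = ["1.jpg", "666.jpg", "2.jpg"]  # illustrative file names
bad = {"666.jpg"}                       # pretend this file failed the check

for image_reference in images:
    if image_reference in bad:
        del image_reference  # unbinds the loop variable only

print(len(images))  # still 3: the list itself is unchanged

# One way to actually filter: build a new list without the bad entries
kept = [name for name in images if name not in bad]
print(kept)  # ['1.jpg', '2.jpg']
```

Building a new list also avoids the pitfalls of removing items from a list while iterating over it.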
