We expected 1350 and you have 1500

I am having the same issue as many other people. Has anyone figured out a solution? This is so frustrating.

error: You have a different number of files in “/tmp/cats-v-dogs/training/cats/” directory than expected. We expected 1350 and you have 1500.

print(len(os.listdir('/tmp/cats-v-dogs/training/cats/')))
print(len(os.listdir('/tmp/cats-v-dogs/training/dogs/')))
print(len(os.listdir('/tmp/cats-v-dogs/testing/cats/')))
print(len(os.listdir('/tmp/cats-v-dogs/testing/dogs/')))

My output is:

1350
1350
150
150

My output looks correct. So I am not sure what the problem can be.

Solution:

The directory-creation code was correct:

# Use os.mkdir to create your directories
# You will need a directory for cats-v-dogs, and subdirectories for training
# and testing. These in turn will need subdirectories for 'cats' and 'dogs'

try:
    # YOUR CODE GOES HERE
    os.mkdir("/tmp/cats-v-dogs/")
    os.mkdir("/tmp/cats-v-dogs/training/")
    os.mkdir("/tmp/cats-v-dogs/testing/")

    os.mkdir("/tmp/cats-v-dogs/training/cats/")
    os.mkdir("/tmp/cats-v-dogs/training/dogs/")

    os.mkdir("/tmp/cats-v-dogs/testing/cats/")
    os.mkdir("/tmp/cats-v-dogs/testing/dogs/")

except OSError:
    # Raised, for example, if a directory already exists when the cell is re-run
    print('Error')
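As an aside, os.mkdir raises an OSError when a directory already exists, so re-running the cell above prints 'Error'. Below is a minimal sketch of a re-runnable variant built on os.makedirs with exist_ok=True. The helper name make_dataset_dirs is hypothetical, and the assignment comment explicitly asks for os.mkdir, so this may not be what the grader wants to see:

```python
import os


def make_dataset_dirs(base_dir):
    """Create base_dir/{training,testing}/{cats,dogs}; safe to call repeatedly."""
    created = []
    for split in ("training", "testing"):
        for label in ("cats", "dogs"):
            path = os.path.join(base_dir, split, label)
            # exist_ok=True suppresses the error when the directory is already there
            os.makedirs(path, exist_ok=True)
            created.append(path)
    return created
```

Calling make_dataset_dirs("/tmp/cats-v-dogs") would create the same four leaf directories as the code above.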

The issue is caused by some zero-size files, which you need to filter out.
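To see whether this applies to your copy of the data, here is a quick sketch that lists the zero-length files in a directory. The helper name find_empty_files is made up for this example:

```python
import os


def find_empty_files(directory):
    """Return the sorted names of regular files in `directory` whose size is 0 bytes."""
    empty = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if os.path.isfile(path) and os.path.getsize(path) == 0:
            empty.append(name)
    return sorted(empty)
```

For example, print(find_empty_files("/tmp/PetImages/Cat/")) should show which source images the split needs to skip.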

import os
import random
from shutil import copyfile

def split_data(SOURCE, TRAINING, TESTING, SPLIT_SIZE):
    # Use this code to filter out zero-length files.
    all_files = []

    for file_name in os.listdir(SOURCE):
        file_path = os.path.join(SOURCE, file_name)

        if os.path.getsize(file_path):
            all_files.append(file_name)
        else:
            print('{} is zero length, so ignoring'.format(file_name))

    # YOUR CODE STARTS HERE
    image_names = all_files
    n_images = len(all_files)

    # random.sample over the full list returns a shuffled copy
    random_image_shuffle = random.sample(image_names, n_images)

    # The first SPLIT_SIZE fraction goes to training, the remainder to testing,
    # so the two sets never overlap and together cover every non-empty file
    train_len = round(n_images * SPLIT_SIZE)

    train_image = random_image_shuffle[:train_len]
    test_image = random_image_shuffle[train_len:]
    print(len(train_image))
    print(len(test_image))

    for image in train_image:
        copyfile(os.path.join(SOURCE, image), os.path.join(TRAINING, image))

    for image in test_image:
        copyfile(os.path.join(SOURCE, image), os.path.join(TESTING, image))
    # YOUR CODE ENDS HERE

CAT_SOURCE_DIR = "/tmp/PetImages/Cat/"
TRAINING_CATS_DIR = "/tmp/cats-v-dogs/training/cats/"
TESTING_CATS_DIR = "/tmp/cats-v-dogs/testing/cats/"

DOG_SOURCE_DIR = "/tmp/PetImages/Dog/"
TRAINING_DOGS_DIR = "/tmp/cats-v-dogs/training/dogs/"
TESTING_DOGS_DIR = "/tmp/cats-v-dogs/testing/dogs/"

split_size = .9
split_data(CAT_SOURCE_DIR, TRAINING_CATS_DIR, TESTING_CATS_DIR, split_size)
split_data(DOG_SOURCE_DIR, TRAINING_DOGS_DIR, TESTING_DOGS_DIR, split_size)
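For reference, the numbers in this thread are consistent with 1500 usable images per class and a 0.9 split: 1350 for training and 150 for testing. Here is a small sketch of that arithmetic. The helper name expected_split_counts is hypothetical, and the 1500 figure is inferred from the counts printed in this post (1350 + 150), not taken from the grader:

```python
def expected_split_counts(n_files, split_size):
    """Return (train_count, test_count) for n_files split at the given fraction."""
    train_count = round(n_files * split_size)
    # Whatever is not used for training goes to testing
    test_count = n_files - train_count
    return train_count, test_count


train_count, test_count = expected_split_counts(1500, 0.9)
print(train_count, test_count)
```

If the counts in your directories differ from these, the number of files entering the split is probably not what you think it is.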

I hope this helps everyone else who was stuck.
