Take(1) returns different image each time?

Meir · April 23, 2021, 9:59am

I simplified the given code:

    for image, _ in train_dataset.take(1):
        plt.figure(figsize=(3, 3))
        first_image = image[0]
        ax = plt.subplot()
        augmented_image = tf.expand_dims(first_image, 0)
        plt.imshow(augmented_image[0] / 255)
        plt.axis('off')

There are two things I would like to understand:

Why do I get a different image each time I run this code?
Why is expand_dims needed? And, If we need this dimension, how come we take [0] on the next line?

AmmarMohanna · April 23, 2021, 10:25am

Hello Meir, thank you for your interesting question.

I will try to guide you to find the answers on your own.

Check the TF source code, and if that does not help, I would love to explain more.
expand_dims returns a tensor with a length-1 axis inserted at the index given by ‘axis’. If you want to know why it is needed, try removing it and see for yourself. The [0] index is used because expand_dims adds a leading dimension of length 1, and [0] selects the single image back out of it.
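To make the shapes concrete, here is a minimal sketch. The 4x4 RGB tensor of zeros is a hypothetical stand-in for first_image, not the real data. In the simplified snippet the two operations cancel each other out; presumably the original code passed the expanded tensor to something that expects a batch of images.

```python
import tensorflow as tf

# Hypothetical stand-in for one image: height 4, width 4, 3 color channels.
first_image = tf.zeros([4, 4, 3])

# Insert a new axis of length 1 at position 0, turning the image into a batch of one.
batched = tf.expand_dims(first_image, 0)

# Indexing with [0] selects the only element of that batch, i.e. the original image.
recovered = batched[0]

print(first_image.shape)  # (4, 4, 3)
print(batched.shape)      # (1, 4, 4, 3)
print(recovered.shape)    # (4, 4, 3)
```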

Hope this helps, if not I would be happy to assist more.
Regards,

Meir · April 23, 2021, 10:53am

The code and the documentation suggest the opposite - that the same first elements of the dataset should be chosen each time. In fact, this is what happens when I try their example:

    dataset = tf.data.Dataset.range(10)
    dataset = dataset.take(3)
    list(dataset.as_numpy_iterator())

Why would something different happen with the images dataset?

I am still lost. Could you please give a concrete example with concrete values?

henrikh · February 2, 2022, 2:21pm

Hi,

My understanding is that in fact train_dataset inherits the functionalities of

    image_dataset_from_directory(directory,
                                 shuffle=True,
                                 batch_size=BATCH_SIZE,
                                 image_size=IMG_SIZE,
                                 validation_split=0.2,
                                 subset='training',
                                 seed=42)

I realized that if you set shuffle = False in

    train_dataset = image_dataset_from_directory(directory,
                                                 shuffle=False,
                                                 batch_size=BATCH_SIZE,
                                                 image_size=IMG_SIZE,
                                                 validation_split=0.2,
                                                 subset='training',
                                                 seed=42)

then the images are always the same, no matter how many times you run the image plotting cell. If you set shuffle=True, the images are different at every run of the image plotting cell, but the order in which they appear is the same. I guess this means that with shuffle=False the 9 images come from the first batch of size 32, because the data is sorted in alphanumeric order. In contrast, with shuffle=True the data is reshuffled at every run of train_dataset.take(1), so we get a different batch every time. Nevertheless, in both cases the set of training images is the same: with shuffle=True it is shuffled, and with shuffle=False it is not.
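Here is a small sketch of that behavior with concrete values. It uses tf.data.Dataset.range(10) as a hypothetical stand-in for the image dataset and assumes that shuffle=True amounts to a Dataset.shuffle call, which reshuffles on every new iteration by default. The helper first_element is made up for this illustration.

```python
import tensorflow as tf

# Hypothetical stand-in for the image dataset: the integers 0..9 in sorted order.
unshuffled = tf.data.Dataset.range(10)

# shuffle() uses reshuffle_each_iteration=True by default, so every fresh
# iteration over the dataset draws a new order, even with a fixed seed.
shuffled = tf.data.Dataset.range(10).shuffle(buffer_size=10, seed=42)

def first_element(ds):
    """Iterate over ds.take(1) from scratch, like one run of the plotting cell."""
    return [int(x) for x in ds.take(1)][0]

unshuffled_firsts = [first_element(unshuffled) for _ in range(5)]
shuffled_firsts = [first_element(shuffled) for _ in range(5)]

print(unshuffled_firsts)  # [0, 0, 0, 0, 0]
print(shuffled_firsts)    # usually a mix of different values
```

Passing reshuffle_each_iteration=False to shuffle() would be one way to keep the shuffled order fixed across iterations.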

Henrikh
