Why create two generators instead of one?

Any benefit of creating two separate image generators for rescaling here?

Instead of -

# All images will be rescaled by 1./255.
train_datagen = ImageDataGenerator( rescale = 1.0/255. )
test_datagen = ImageDataGenerator( rescale = 1.0/255. )

# --------------------
# Flow training images in batches of 20 using train_datagen generator
# --------------------
train_generator = train_datagen.flow_from_directory(train_dir,
                                                    batch_size=20,
                                                    class_mode='binary',
                                                    target_size=(150, 150))

# --------------------
# Flow validation images in batches of 20 using test_datagen generator
# --------------------
validation_generator = test_datagen.flow_from_directory(validation_dir,
                                                        batch_size=20,
                                                        class_mode='binary',
                                                        target_size=(150, 150))

Can we do this -

img_gen = ImageDataGenerator(rescale=1/255)

train_gen = img_gen.flow_from_directory('/Users/sailsabnis/Downloads/coursera/cats_and_dogs_filtered/train',
                                        batch_size=20,
                                        class_mode='binary',
                                        target_size=(150,150))
validation_gen = img_gen.flow_from_directory('/Users/sailsabnis/Downloads/coursera/cats_and_dogs_filtered/validation/',
                                             batch_size=20,
                                             class_mode='binary',
                                             target_size=(150,150))

Did you try it? Could you do it? If not, perhaps each generator keeps some state in memory that doesn't allow it to be reused as a fresh component!

Hi there!

Yes, I did try it and it works smoothly. There's no need to create two datagens (test and train); we can simply create one datagen, say 'img_gen', and use it for both the train and test generators.

My question, though, was why Laurence chose to have two separate datagens. Is there any additional benefit we get out of that down the road?


No, I don't think there is any additional benefit, other than making it clearer for the learner!


It may work in this particular instance, but it could also be subtly wrong. Maybe you end up training on the validation data. How could you tell if that is what is happening?

We're in the deep end of the swimming pool here, doing sophisticated Object Oriented Programming. To get correct results, you need to be very clear about how the underlying TF classes are defined. When you have an instance of the ImageDataGenerator class and you invoke its flow_from_directory() method on that instance, does each call generate a separate object, or does it just reconfigure one existing object both times? You need to be sure of that for your version to work if your usage of those objects will be interleaved. I have not taken these deeper TF courses, just learned TF through DLS C2, C4 and C5, so I have not encountered that class definition before.

Maybe Laurence is just trying to set a good "style" example, so that you get in the habit of writing code that is more likely to work in the general case, and to protect you from harm in the case that you have not studied the definitions of the classes you are using with sufficient care. Or maybe he really knows that this is necessary in this particular case and your code is subtly wrong, but you don't have a way to tell.
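For what it's worth, a quick experiment along these lines would answer that question (this is just a sketch, assuming the standard tf.keras ImageDataGenerator API; the directory paths are placeholders):

from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_dir = 'cats_and_dogs_filtered/train'            # placeholder path
validation_dir = 'cats_and_dogs_filtered/validation'  # placeholder path

img_gen = ImageDataGenerator(rescale=1.0/255.0)

# Call flow_from_directory() twice on the same ImageDataGenerator instance
train_gen = img_gen.flow_from_directory(train_dir,
                                        batch_size=20,
                                        class_mode='binary',
                                        target_size=(150, 150))
validation_gen = img_gen.flow_from_directory(validation_dir,
                                             batch_size=20,
                                             class_mode='binary',
                                             target_size=(150, 150))

# If each call returns a fresh iterator object, the two generators are independent
print(train_gen is validation_gen)    # expected: False (two separate objects)
print(type(train_gen).__name__)       # expected: DirectoryIterator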


Thanks Paulin for taking the time out.

I don’t think we end up doing training on the validation data.

All it does is initialise a single instance (img_gen) of ImageDataGenerator, which normalises whatever is fed to it to values between 0 and 1.

Then,
train_gen = img_gen.flow_from_directory('train_data_path'…)
and
validation_gen = img_gen.flow_from_directory('validation_data_path'…)

Both of these lines dictate which data we are feeding (train vs validation).
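A quick way to convince ourselves of that (just a sketch, assuming the standard tf.keras DirectoryIterator attributes and reusing the train_gen / validation_gen created above):

# Each iterator keeps its own directory and file count, so the train/validation
# split is preserved even though both came from the same ImageDataGenerator.
print(train_gen.directory)        # the train folder
print(validation_gen.directory)   # the validation folder
print(train_gen.samples)          # number of images found under train
print(validation_gen.samples)     # number of images found under validation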

I agree it is more of a best practice to write clean code. Thank you!