(i) batch_size vs (ii) steps_per_epoch vs (iii) validation_steps

In C2W1_Lab_1_cats_vs_dogs.ipynb,
I saw two code snippets:
################# Code 1 ###################

train_generator = train_datagen.flow_from_directory(train_dir,
                                                    batch_size=20,
                                                    class_mode='binary',
                                                    target_size=(150, 150))
################ Code 2 #######################

history = model.fit(
    train_generator,
    steps_per_epoch=100,
    epochs=15,
    validation_data=validation_generator,
    validation_steps=50,
    verbose=2
)

#############################################
The train_generator is batched.
I thought this setting automatically defined the number of steps in an epoch as roughly (number of training samples / batch_size).

However, the second snippet confuses me, because steps_per_epoch and validation_steps can also be set in model.fit. I must be misunderstanding something.

Can anyone tell me how to choose suitable values for steps_per_epoch and validation_steps?
Their ratio is 2:1, but the sizes of the training and validation sets are not 2:1.

One criterion that influences steps_per_epoch and validation_steps is the training time / budget.

For this assignment, setting these 2 parameters doesn’t matter, since there are 2000 training images, which is the same as steps_per_epoch * batch_size (100 * 20), and 1000 validation images, which is the same as validation_steps * batch_size.
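To make the arithmetic concrete, here is a minimal sketch of how the step counts for one full pass could be computed. The helper name `steps_for_full_pass` is made up for illustration, and I am assuming the validation generator also uses batch_size=20, which is what 50 * batch_size = 1000 implies.

```python
import math


def steps_for_full_pass(num_samples: int, batch_size: int) -> int:
    """Batches needed to see every sample once (the last batch may be smaller)."""
    return math.ceil(num_samples / batch_size)


# Numbers from this thread: 2000 training images, 1000 validation images.
# Assumption: both generators use batch_size=20.
train_steps = steps_for_full_pass(2000, 20)
val_steps = steps_for_full_pass(1000, 20)

print(train_steps, val_steps)  # 100 50
```

With these values, each epoch covers each set exactly once. Choosing smaller numbers shortens an epoch, at the cost of each epoch seeing only part of the data.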

balaji, I like your answer very much! It helps!

I checked the data provided again. Sorry for the wrong info: there are exactly 2000 training images and 1000 validation images.

I am glad it solved your query sir

Does specifying ‘steps_per_epoch’ have the same effect as specifying ‘batch_size’?
Can both be specified at the same time? If so, do they need to be consistent with each other (e.g. number of images = steps_per_epoch * batch_size)?

What do you understand from this?

First, thanks for pointing to the relevant section of this exhaustive document. It might seem obvious if you are familiar with it but it’s not easy to navigate if you are not.

I did not find the answer to my question in the definition of batch_size or steps_per_epoch.
The definition of steps_per_epoch states that “the default None is equal to the number of samples in your dataset divided by the batch size”

I did find a hint in the definition of epochs: “An epoch is an iteration over the entire x and y data provided (unless the steps_per_epoch flag is set to something other than None)”

What I am still unclear about is the following scenario:
Let’s say I have a dataset of 10,000 images.
I specify batch_size = 20, steps_per_epoch = 100 and epochs = 10.

Since dataset size / batch_size = 500 and model.fit will use only 100 of those batches per epoch (steps_per_epoch = 100), does that mean that the other 400 batches will never be used? Or will all batches be used (e.g. the second epoch starting with the first batch that was not used by the first epoch), so that every batch is used exactly twice during this fit?

Iteration is circular across the dataset.
The 1st epoch will train on the first 100 batches, i.e. [0-99]. The 2nd epoch will train on the next 100 batches, i.e. [100-199]. Once the end of the dataset is reached, iteration starts again from batch 0.
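As a quick sanity check on the 10,000-image scenario, the counts can be worked out in a few lines. This is plain arithmetic and not a Keras call, and it assumes the circular iteration described above.

```python
num_images = 10_000
batch_size = 20
steps_per_epoch = 100
epochs = 10

batches_per_pass = num_images // batch_size            # 500 batches cover the dataset once
batches_consumed = steps_per_epoch * epochs            # 1000 batches drawn over the whole fit
full_passes = batches_consumed / batches_per_pass      # 2.0 passes over the data
epochs_per_pass = batches_per_pass // steps_per_epoch  # 5 epochs to see every image once

print(batches_per_pass, batches_consumed, full_passes, epochs_per_pass)
```

So no batch is skipped: it takes 5 epochs to go through the dataset once, and the 10 epochs go through it twice.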

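There’s one case to consider, which is batches of different sizes. This happens when the dataset size is not a multiple of the batch size, so the last batch is smaller. Consider the following setup: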

  1. Dataset has 10 points (we’ll call them [0, 1, 2, …, 9])
  2. Steps per epoch = 3
  3. Batch size = 3
  4. Epochs = 2

The 1st epoch will consist of 3 batches: [0-2], [3-5], [6-8]
The 2nd epoch will consist of 3 batches: [9], [0-2], [3-5]
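The setup above can be reproduced with a small pure-Python simulation. This is only a sketch of the circular behaviour, not Keras's actual implementation: `cyclic_batches` is a hypothetical helper, and it assumes the data is not shuffled between passes.

```python
from itertools import islice


def cyclic_batches(data, batch_size):
    """Yield consecutive batches forever, restarting from the beginning
    after the last (possibly shorter) batch."""
    while True:
        for start in range(0, len(data), batch_size):
            yield data[start:start + batch_size]


data = list(range(10))  # the 10 points [0, 1, ..., 9]
steps_per_epoch = 3
epochs = 2

batches = cyclic_batches(data, batch_size=3)

# Each epoch simply draws the next `steps_per_epoch` batches from the same stream.
schedule = [list(islice(batches, steps_per_epoch)) for _ in range(epochs)]

for epoch, epoch_batches in enumerate(schedule, start=1):
    print(f"epoch {epoch}: {epoch_batches}")
# epoch 1: [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
# epoch 2: [[9], [0, 1, 2], [3, 4, 5]]
```

Note that the 2nd epoch starts with the short batch [9], so it sees 7 points instead of 9.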


This is very helpful. Thanks!