Purpose of steps per epoch

Hello, I noticed that we set steps per epoch when we fit the model. However, we also set the batch size when we use the image data generator’s flow_from_directory method. What is the point of using both? Let’s say we have 2000 images and we set the batch size to 20. Why do we also set steps per epoch to 100? Isn’t that already determined by the batch size, since 2000 images in batches of 20 give 100 iterations per epoch anyway? Could you help me make this clear? Thank you.

If you have a large dataset of, say, 20,000 samples and the batch size is 20, you’ll have 1000 batches, i.e. each epoch will consist of 1000 batches of data. If you want to limit the number of batches per epoch, use steps_per_epoch. When this is set to a value like 10, each epoch will process only 10 batches of data.
One thing to note is that the data is not restarted from the beginning in this case: once the 1st epoch has processed batches 1 through 10, the next epoch will start from the 11th batch in the underlying dataset.
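
To make the "next epoch continues from the 11th batch" point concrete, here is a minimal pure-Python sketch. It does not call Keras; `batch_stream` and `run_epochs` are hypothetical helpers that only imitate an iterator that is shared across epochs and never reset.

```python
from itertools import count


def batch_stream(num_batches):
    """Yield batch numbers 1..num_batches over and over, like a looping data iterator."""
    for i in count():
        yield i % num_batches + 1


def run_epochs(stream, epochs, steps_per_epoch):
    """Draw steps_per_epoch batches per epoch from one shared stream, without resetting it."""
    history = []
    for _ in range(epochs):
        history.append([next(stream) for _ in range(steps_per_epoch)])
    return history


num_samples, batch_size = 20_000, 20
num_batches = num_samples // batch_size  # 1000 batches in the full dataset

epochs_seen = run_epochs(batch_stream(num_batches), epochs=2, steps_per_epoch=10)
print(epochs_seen[0])  # batches 1 through 10
print(epochs_seen[1])  # batches 11 through 20
```

With steps_per_epoch set to 10, it would take 100 such epochs before every one of the 1000 batches has been seen once.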

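If steps_per_epoch is not set, each epoch will process all the batches in the dataset. In your example, 2000 images with a batch size of 20 give exactly 100 batches, so setting steps_per_epoch to 100 makes each epoch one full pass over the data.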
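
As a rough check of the arithmetic, the number of batches in one full pass can be computed by hand. This is only a sketch: `full_pass_steps` is a hypothetical helper, not a Keras function, and it assumes the last batch is allowed to be smaller than the others.

```python
import math


def full_pass_steps(num_samples, batch_size):
    """Batches needed to see every sample once; the final batch may be partial."""
    return math.ceil(num_samples / batch_size)


print(full_pass_steps(2000, 20))    # 100, the value from the question
print(full_pass_steps(20_000, 20))  # 1000
print(full_pass_steps(2010, 20))    # 101, the extra batch holds the 10 leftover images
```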
