What is the meaning of "reduce the number of training epochs"?

Hello @charindith

An “epoch” means one pass over the whole training set. If my training set has 10 samples, then running 20 epochs means my model is trained on these 10 samples 20 times. However, if I copy the samples 4 times so that my training set becomes 40 samples (4 copies of each of the 10 samples, 4 x 10 = 40), then I only need to run 5 epochs for my model to be trained on the 10 samples the same 20 times. Therefore, making the copies reduces the number of epochs. Here is a minimal sketch of that idea (the toy data, the duplication factor, and the Keras-style `fit` calls are just placeholders, not code from the assignment):
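```python
import numpy as np

# Hypothetical toy data: 10 samples with 3 features each
X = np.random.rand(10, 3)
y = np.random.randint(0, 2, size=10)

# Option A: original data, 20 epochs -> each sample is seen 20 times
# model.fit(X, y, epochs=20)

# Option B: tile the data 4 times, 5 epochs -> each sample is still seen 4 * 5 = 20 times
X_rep = np.tile(X, (4, 1))   # 40 samples
y_rep = np.tile(y, 4)        # 40 labels
# model.fit(X_rep, y_rep, epochs=5)
```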

One reason we preferred 5 epochs of 40 samples over 20 epochs of 10 samples is that the former saves some time: there is some overhead in switching from one epoch to the next, so having fewer epochs means less of that overhead.

One reason we preferred 5 epochs of 40 samples over 1 epoch of 200 samples (20 copies of each of the 10 samples) is that we didn’t want to completely get rid of that overhead. Part of that overhead is tracking the metric and/or loss from one epoch to the next, which lets us design an algorithm that stops training early once some performance criterion is met.
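As a rough illustration (not the assignment’s code), this is how such per-epoch early stopping can be wired up with a Keras callback; the monitored metric and `patience` value here are just example choices:

```python
import tensorflow as tf

# Stop training if the monitored loss has not improved for 2 consecutive epochs.
# The check happens at the end of every epoch, which is why keeping several
# epochs (rather than collapsing everything into one) is useful.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="loss",
    patience=2,
    restore_best_weights=True,
)

# model.fit(X_rep, y_rep, epochs=5, callbacks=[early_stop])
```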

Cheers,
Raymond

PS1: The number of gradient descent steps performed in one epoch depends on the mini-batch size. After copying the data, if the total number of samples is 40 and the mini-batch size is 2, then there will be 20 gradient descent steps in one epoch.
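In code, using the numbers above:

```python
num_samples = 40   # 10 samples copied 4 times
batch_size = 2

steps_per_epoch = num_samples // batch_size
print(steps_per_epoch)  # 20 gradient descent steps per epoch
```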
