Epoch clarification

I want to clarify my understanding of the purpose of the number of epochs. My understanding is that each epoch means the model has seen all of the training data once during training. During each epoch, gradient descent runs and finds the most appropriate coefficients that lead to the lowest loss. So what changes from one epoch to the next?

Hello @Basira_Daqiq,

We go through the whole training set once in each epoch. Within an epoch, the training set is divided into a preconfigured number of mini-batches, and one gradient descent step is performed per mini-batch. Each step moves the weights TOWARDS an optimal solution, but no single step guarantees reaching the optimum. We need to distinguish between “moving towards” and “reaching”.

The point here is that a gradient descent step does NOT reach the lowest loss. It only moves the weights TOWARDS the lowest loss. Therefore, one step doesn’t guarantee it, and one epoch doesn’t guarantee it either. If one epoch is not sufficient, we run another one. That’s why we usually want more than one epoch.
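
To make the epoch / mini-batch / step distinction concrete, here is a minimal NumPy sketch (not from the course; the toy data, learning rate, and batch size are made-up assumptions). Each inner iteration is one gradient descent step over one mini-batch, and the loss printed after each epoch keeps dropping, which is why running several epochs helps:

```python
import numpy as np

# Toy data for illustration (hypothetical): learn w, b so that y ≈ w * x + b.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=200)
y = 3.0 * x + 0.5 + rng.normal(0, 0.1, size=200)

w, b = 0.0, 0.0          # initial weights, far from the optimum
lr = 0.1                 # learning rate (assumed)
batch_size = 32
num_epochs = 5

for epoch in range(num_epochs):
    # One epoch: shuffle, then go through the whole training set once,
    # split into mini-batches.
    idx = rng.permutation(len(x))
    for start in range(0, len(x), batch_size):
        batch = idx[start:start + batch_size]
        xb, yb = x[batch], y[batch]

        # One gradient descent step per mini-batch: move w, b TOWARDS
        # lower loss, not all the way to the minimum.
        err = (w * xb + b) - yb
        grad_w = 2 * np.mean(err * xb)
        grad_b = 2 * np.mean(err)
        w -= lr * grad_w
        b -= lr * grad_b

    loss = np.mean(((w * x + b) - y) ** 2)
    print(f"epoch {epoch + 1}: loss = {loss:.4f}, w = {w:.3f}, b = {b:.3f}")
```

If you run something like this, the weights after epoch 1 are better than the initial ones but still not optimal, and each extra epoch moves them closer. That is all that “changes from one epoch to the next”: the starting weights for the next epoch are wherever the previous epoch left them.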

Cheers,
Raymond


thank you!