In the practice lab of week 3, when training the complex neural network, I noticed that the loss sometimes increases from one epoch to the next. I used the Adam optimizer with a learning rate of 0.01, as the hint suggested, and I thought the loss should decrease every epoch. Can someone explain why the loss increases, please?
It's normal for the loss to occasionally increase during training, even with the Adam optimizer. These fluctuations are part of the optimization process, and as long as the overall trend is downward, the model is still learning effectively.
As far as I remember, it is OK for the loss to increase in a few epochs, but in general it should decrease; a small sketch below shows a quick way to check the overall trend. How much it fluctuates depends on your model architecture and hyper-parameter values.
Hope it helps! Feel free to ask if you need further assistance.
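If you want to see this for yourself, here is a minimal sketch, assuming a TensorFlow/Keras setup like the one in the lab; the model and data below are just placeholders, not the lab's actual ones. The loss curve will usually wiggle from epoch to epoch, and what matters is the downward trend.

```python
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

# Placeholder data just to make the example runnable.
X = np.random.rand(1000, 20).astype("float32")
y = (X.sum(axis=1) > 10).astype("float32")

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.01),
              loss="binary_crossentropy")

history = model.fit(X, y, epochs=50, batch_size=32, verbose=0)

# Plot the per-epoch training loss: it fluctuates, but should trend down.
plt.plot(history.history["loss"])
plt.xlabel("epoch")
plt.ylabel("training loss")
plt.show()
```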
@Alireza_Saei Thanks for your reply! I think I understand it somewhat now. By the way, it seems Dr. Andrew doesn't mention these "fluctuations" in the specialization. Can you suggest some documents or other courses about this?
In a certain epoch the model might see data it has not previously seen, because of data shuffling and splitting, so the loss for that epoch might jump up; but in the next epoch, if it comes across that data again, it won't jump up anymore.
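To illustrate just the shuffling part (a tiny sketch of my own, assuming mini-batch training as in the lab): the dataset itself is the same every epoch, but reshuffling changes which samples land in each mini-batch, so the sequence of gradient steps the optimizer takes differs from epoch to epoch.

```python
# Same 8 samples every epoch, but the mini-batch composition (and therefore
# the order of gradient steps) changes after each reshuffle.
import numpy as np

rng = np.random.default_rng(0)
n_samples, batch_size = 8, 4
indices = np.arange(n_samples)

for epoch in range(3):
    order = rng.permutation(indices)          # reshuffle at the start of each epoch
    batches = order.reshape(-1, batch_size)   # split into mini-batches
    print(f"epoch {epoch}: batches = {batches.tolist()}")
```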
@gent.spah I'm afraid I can't agree with you. I thought that in an epoch the model passes through the entire dataset, so it can't "see data that it has not previously seen".
@fantastichaha11 remember, with SGD, say, our solution spaces are high-dimensional hypersurfaces and massively complex.
We can't just immediately "see" the minimum. If we could, we wouldn't even have to iterate at all: we'd just head right for the goal.
And though we have a strict cost function and its derivative to "guide" us, we still have to take a bit of a "blind leap" each time, and there is no guarantee (it can also depend on the actual distribution of the data we are looking at) that each step is always better; the small numeric sketch below illustrates this.
Though most of the time it is.
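Here is a toy 1D illustration of that "blind leap" (my own sketch, not from the course): on the simple loss f(w) = w², a gradient step with a modest learning rate improves the loss, while a step with a too-large learning rate overshoots the minimum and makes it worse, even though the gradient points the right way.

```python
# Gradient descent on f(w) = w^2: a single step can increase the loss if the
# learning rate is too large.
def f(w):
    return w ** 2

def grad(w):
    return 2 * w

w = 1.0
for lr in (0.1, 1.1):                       # modest vs. too-large step size
    w_new = w - lr * grad(w)
    print(f"lr={lr}: loss before={f(w):.3f}, after={f(w_new):.3f}")
# lr=0.1: 1.000 -> 0.640  (improves)
# lr=1.1: 1.000 -> 1.440  (gets worse)
```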
@Nevermnd has a good point here!
As you pointed out, the model trains through every data point in each epoch. But, as @Alireza_Saei noted, the training setup also includes choices such as batch normalization and random augmentation (e.g. random flips), and these behave slightly differently in every epoch. The weights carry over from one epoch to the next, but the loss is recomputed from scratch on that epoch's re-shuffled, re-augmented presentation of the data, so if a particular epoch doesn't learn anything new, its loss can end up a bit higher than the previous one. That is completely normal as long as the following epochs resume a decreasing trend; if the loss keeps rising instead, one has to suspect overfitting, or a problem with the parameters or optimization method used.
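One practical way to tell normal fluctuation apart from a model that has stopped improving or started overfitting is to track a validation set and stop when its loss stops improving. A small sketch, assuming a Keras model with placeholder data (not the lab's actual model):

```python
import numpy as np
import tensorflow as tf

# Placeholder data and model, just to make the sketch self-contained.
X = np.random.rand(500, 20).astype("float32")
y = (X.sum(axis=1) > 10).astype("float32")

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")

early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",         # watch validation loss, not training loss
    patience=5,                 # tolerate a few bad epochs before stopping
    restore_best_weights=True,  # roll back to the best epoch seen
)

model.fit(X, y, validation_split=0.2, epochs=100,
          callbacks=[early_stop], verbose=0)
```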
The same logic applies to accuracy: accuracy should increase while the loss decreases. Remember, though, that sometimes a drop in loss still comes with a drop in accuracy, which is again a warning sign for the model.
The accuracy curve records how accurate the model's predictions are on the given data, while the loss curve records the actual difference between the model's predictions and the true outputs.
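A tiny numeric example (my own, not from the course) of why the two curves can disagree: accuracy only counts thresholded predictions, while binary cross-entropy also rewards confidence, so one set of predictions can have a lower loss and a lower accuracy at the same time.

```python
import numpy as np

def bce(y_true, p):
    """Binary cross-entropy averaged over the samples."""
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

def accuracy(y_true, p):
    """Fraction of predictions correct after thresholding at 0.5."""
    return np.mean((p >= 0.5) == y_true)

y   = np.array([1, 1, 1, 0])
p_a = np.array([0.55, 0.55, 0.55, 0.45])  # all correct, but barely confident
p_b = np.array([0.95, 0.95, 0.45, 0.05])  # one mistake, very confident elsewhere

print(f"A: loss={bce(y, p_a):.3f}  acc={accuracy(y, p_a):.2f}")  # ~0.598, 1.00
print(f"B: loss={bce(y, p_b):.3f}  acc={accuracy(y, p_b):.2f}")  # ~0.238, 0.75
```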
This part of model training is actually explained very well in the Deep Learning Specialization.
Regards
DP