'sgd' optimizer

Optimisation algorithms are designed to minimise the loss when training your model. How efficient they are can be measured by their speed of convergence - how many epochs you need to reach a good optimum - and by their generalisation capabilities - how well your model reacts to new data.

In our extremely simple case, SGD converges quicker than Adam, which is why you get a lower loss. We shouldn’t draw any general conclusion from that, as you’ll find many cases where Adam is much faster and better than SGD. As a suggestion, you can also play with the learning_rate and you’ll notice that the results can change in favour of Adam.
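If you want to try this yourself, here’s a minimal sketch of that comparison (the toy data, model shape and learning rates are my own assumptions for illustration, not the exact code from this thread):

```python
# Sketch: compare SGD and Adam on a tiny regression task.
# Toy data, model shape and learning rates are assumptions for illustration.
import numpy as np
from tensorflow import keras

# Toy data: learn y = 2x - 1
x = np.arange(-1.0, 1.0, 0.01).reshape(-1, 1)
y = 2 * x - 1

def build_model(optimizer):
    # Single Dense unit is enough for a linear relationship
    model = keras.Sequential([keras.Input(shape=(1,)), keras.layers.Dense(1)])
    model.compile(optimizer=optimizer, loss="mse")
    return model

# Try both optimizers; tweak learning_rate and compare the final losses
for opt in (keras.optimizers.SGD(learning_rate=0.01),
            keras.optimizers.Adam(learning_rate=0.01)):
    model = build_model(opt)
    history = model.fit(x, y, epochs=50, verbose=0)
    print(type(opt).__name__, "final loss:", history.history["loss"][-1])
```

Rerunning this with different learning_rate values (say 0.001 vs 0.1) is the quickest way to see how the ranking between the two optimizers can flip.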
