In addition to 2):
If Adam does not converge well, AMSGrad might be worth a look; see also:
https://johnchenresearch.github.io/demon/
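If you happen to be using PyTorch, AMSGrad is just a flag on the built-in Adam optimizer. A minimal sketch (the model and learning rate here are only placeholders):

```python
import torch

# Placeholder model; any nn.Module works the same way.
model = torch.nn.Linear(10, 1)

# Setting amsgrad=True switches Adam to the AMSGrad variant.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, amsgrad=True)
```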
The linked page also explains some other algorithms, such as QHM (Quasi-Hyperbolic Momentum), which decouples the momentum term from the current gradient in the weight update; that can also be beneficial! A small hand-written sketch of the update rule follows below.
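In case you want to try QHM directly, here is a minimal hand-written sketch of one update step on top of plain PyTorch. The `qhm_step` helper and its hyperparameter values (`lr`, `beta`, `nu`) are just illustrative choices, not a library API:

```python
import torch

def qhm_step(params, grads, bufs, lr=0.1, beta=0.999, nu=0.7):
    """One QHM update, applied in place.

    The update is a weighted average of the raw gradient and the
    momentum buffer, which is what decouples the two:
        buf   <- beta * buf + (1 - beta) * grad
        param <- param - lr * ((1 - nu) * grad + nu * buf)
    nu = 1 recovers momentum SGD, nu = 0 plain SGD.
    """
    with torch.no_grad():
        for p, g, buf in zip(params, grads, bufs):
            buf.mul_(beta).add_(g, alpha=1 - beta)       # update momentum buffer
            p.add_((1 - nu) * g + nu * buf, alpha=-lr)   # decoupled weight update


# Tiny usage example: minimize a simple quadratic.
w = torch.randn(3, requires_grad=True)
buf = [torch.zeros_like(w)]
for _ in range(100):
    loss = (w ** 2).sum()
    loss.backward()
    qhm_step([w], [w.grad], buf)
    w.grad = None
```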
Best regards
Christian