DLS C2W3 Momentum vs Adam Beta Hyperparameters

Alancaster · April 2, 2023, 3:27pm

In the C2W3 video "Tuning Process", Professor Ng says the momentum term is a second-priority hyperparameter for tuning, but he goes on to say that he almost never tunes the beta parameters for Adam optimization. This is confusing to me because Adam essentially combines momentum and RMSProp. In this context, isn’t beta1 the same as the momentum hyperparameter?

arosacastillo · April 4, 2023, 9:05am

Hi Alan,

I will try to help you understand these concepts. This is what I found for you:

beta1. The exponential decay rate for the first moment estimates (e.g. 0.9).
beta2. The exponential decay rate for the second-moment estimates (e.g. 0.999). This value should be set close to 1.0 on problems with a sparse gradient (e.g. NLP and computer vision problems).

Reference: https://machinelearningmastery.com/adam-optimization-algorithm-for-deep-learning/
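To connect these definitions to the question about momentum, here is a minimal NumPy sketch. It is only an illustration, not code from the course or from the linked article, and the function names `momentum_step` and `adam_step` are hypothetical. It assumes the exponentially weighted average form of momentum used in the lectures, with the defaults beta1 = 0.9 and beta2 = 0.999 quoted above; the learning rate of 0.01 and the example gradients are arbitrary.

```python
import numpy as np

def momentum_step(w, grad, v, lr=0.01, beta=0.9):
    """One step of gradient descent with momentum (EMA form)."""
    v = beta * v + (1 - beta) * grad          # running average of gradients
    w = w - lr * v
    return w, v

def adam_step(w, grad, m, s, t, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    """One step of Adam; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad        # first moment: same EMA as momentum
    s = beta2 * s + (1 - beta2) * grad ** 2   # second moment: RMSProp-style average
    m_hat = m / (1 - beta1 ** t)              # bias correction
    s_hat = s / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(s_hat) + eps)
    return w, m, s

# Feed both optimizers the same (made-up) gradients and compare the averages.
w_mom = np.array([1.0, -2.0])
w_adam = w_mom.copy()
v = np.zeros(2)
m = np.zeros(2)
s = np.zeros(2)
for t in range(1, 6):
    g = np.array([0.5 * t, -1.0])
    w_mom, v = momentum_step(w_mom, g, v)
    w_adam, m, s = adam_step(w_adam, g, m, s, t)

print("momentum average == Adam first moment:", np.allclose(v, m))
```

So beta1 plays the same role as the momentum beta when computing the running average. The difference is in how the average is used: Adam divides the step by the square root of the second-moment estimate, which rescales each parameter's update, and that may be one reason the defaults tend to work without much tuning.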

And on the tuning effect, see the Cross Validated question "Deep Learning: How does beta_1 and beta_2 in the Adam Optimizer affect it's learning?"

Tuning both betas to get better results can be tricky, which is probably why it is usually a better strategy to tune other hyperparameters first.

Happy learning,

Rosa

Alancaster · April 11, 2023, 1:14am

Hi Rosa,

Thanks for your reply. Given your response and what was mentioned by Dr. Ng in the lecture, I will consider the above lecture slide incorrect and consider all beta params low priority for tuning.

Thanks
