Choosing between Momentum, RMSprop and Adam in real life

Doron_Modan · November 2, 2022, 2:26pm

As I understand it, Adam is a combination of Momentum and RMSprop. Andrew said that Adam has been used successfully on a wide variety of problems. My question is: does this mean that in real life we will usually opt for Adam (so that Momentum and RMSprop were taught mainly to help us understand the exponentially weighted moving average principle)?
Or do we sometimes need to train with Momentum or RMSprop? If so, is there a recipe for choosing the optimizer, or is it better to just try them all?
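
For readers who want the "combination" made concrete, here is a minimal NumPy sketch of the three update rules in the exponentially weighted average form used in the course. The function names, the `state` dictionary, and the default hyperparameter values are illustrative assumptions, not anything prescribed in this thread.

```python
import numpy as np

# Illustrative sketch: function names, the `state` dict and the default
# hyperparameters are assumptions made for this example.

def momentum_step(w, grad, state, lr=0.01, beta=0.9):
    # v: exponentially weighted average of past gradients.
    state["v"] = beta * state.get("v", 0.0) + (1 - beta) * grad
    return w - lr * state["v"]

def rmsprop_step(w, grad, state, lr=0.01, beta2=0.999, eps=1e-8):
    # s: exponentially weighted average of squared gradients.
    state["s"] = beta2 * state.get("s", 0.0) + (1 - beta2) * grad**2
    return w - lr * grad / (np.sqrt(state["s"]) + eps)

def adam_step(w, grad, state, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    # Adam keeps both averages and applies bias correction to each.
    t = state["t"] = state.get("t", 0) + 1
    state["v"] = beta1 * state.get("v", 0.0) + (1 - beta1) * grad
    state["s"] = beta2 * state.get("s", 0.0) + (1 - beta2) * grad**2
    v_hat = state["v"] / (1 - beta1**t)  # bias-corrected first moment
    s_hat = state["s"] / (1 - beta2**t)  # bias-corrected second moment
    return w - lr * v_hat / (np.sqrt(s_hat) + eps)
```

In this sketch, Adam's numerator is the Momentum average and its denominator is the RMSprop average, which is the sense in which it combines the two.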

Mubsi · November 2, 2022, 3:21pm

Hi @Doron_Modan,

I’m not sure which week of C2 you are currently in, but there will be an assignment where you’ll go through all of these and compare the results.

Best,
Mubsi

Doron_Modan · November 2, 2022, 8:41pm

I’m not sure this answers my question. I was asking about real life.

paulinpaloalto · November 2, 2022, 10:55pm

I’m not an actual practitioner of ML/DL in an industrial setting, so I’m not really qualified to answer about “real life”. But I think your theory is right: in most of the examples we see in the rest of these courses, Adam is the preferred optimization method. The other overall message about hyperparameter choices here in Course 2, though, is that there is no single “silver bullet” answer that works best in every case for the choices you have to make. So it never hurts to have more tools in your toolbox. You start with Adam, and if that doesn’t work well enough, you consider the others.

Mubsi · November 3, 2022, 8:28am

Hi @Doron_Modan,

My apologies for not expanding on what I said above, but Paul has basically covered it already (thanks, Paul!).

What I meant was that you’ll do some comparisons and see what works best. Similarly, in the real world, depending on the scenario, you can train different models and see which one satisfies your needs. Adam is no doubt a popular choice, but others might outperform it on the task at hand.
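
As a rough illustration of that kind of comparison, the sketch below runs the three optimizers on a toy quadratic loss and reports the final loss for each. Everything here is hypothetical: the `make_optimizer` and `compare` helpers, the toy loss, the starting point, and the hyperparameters are assumptions chosen for the example. Results on a toy problem like this say nothing about which optimizer will win on a real task.

```python
import numpy as np

def make_optimizer(name, lr=0.05, beta1=0.9, beta2=0.999, eps=1e-8):
    """Return a step(w, grad) function for 'momentum', 'rmsprop' or 'adam'."""
    v, s, t = 0.0, 0.0, 0

    def step(w, grad):
        nonlocal v, s, t
        t += 1
        v = beta1 * v + (1 - beta1) * grad      # average of gradients
        s = beta2 * s + (1 - beta2) * grad**2   # average of squared gradients
        if name == "momentum":
            return w - lr * v
        if name == "rmsprop":
            return w - lr * grad / (np.sqrt(s) + eps)
        if name == "adam":
            v_hat = v / (1 - beta1**t)          # bias correction
            s_hat = s / (1 - beta2**t)
            return w - lr * v_hat / (np.sqrt(s_hat) + eps)
        raise ValueError(f"unknown optimizer: {name}")

    return step

def loss(w):
    # Toy, poorly conditioned quadratic (an assumption for this example).
    return 0.5 * (w[0] ** 2 + 25.0 * w[1] ** 2)

def grad(w):
    return np.array([w[0], 25.0 * w[1]])

def compare(names, w0, steps=200):
    """Run each optimizer from the same start and return its final loss."""
    results = {}
    for name in names:
        step = make_optimizer(name)
        w = np.array(w0, dtype=float)
        for _ in range(steps):
            w = step(w, grad(w))
        results[name] = loss(w)
    return results

results = compare(["momentum", "rmsprop", "adam"], w0=[3.0, 2.0])
for name, final_loss in sorted(results.items(), key=lambda kv: kv[1]):
    print(f"{name:10s} final loss = {final_loss:.6f}")
```

In practice the same loop would wrap a real model and a validation metric, and the learning rate would be tuned separately for each optimizer before drawing conclusions.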

Best,
Mubsi
