Hi,
I thought Adam was more or less the state-of-the-art optimizer.
I saw RMSProp and SGD used in this course. Is that for educational purposes (showing us how to play with hyperparameters), or is there a technical reason?
Thank you in advance
The choice of optimizer is problem-specific. Consider the 1st assignment in course 1, the housing price prediction problem.
With just the default parameters, if you try `adam` and `sgd`, you'll find that `sgd` results in a lower loss.
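If you want to try this kind of comparison without the course notebook, here is a minimal sketch in plain Python. It is not the assignment code: the data is a hypothetical toy set, the model is a single weight and bias, and the update rules are hand-written full-batch versions of SGD and Adam. The learning rates and Adam constants are my assumption of what the Keras defaults are, so treat the numbers as illustrative only.

```python
import math

# Hypothetical toy data (y = 0.5 * x + 0.5); NOT the course's actual dataset.
XS = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
YS = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5]


def mse_and_grads(w, b):
    """Return the mean squared error and its gradients w.r.t. w and b."""
    n = len(XS)
    errors = [w * x + b - y for x, y in zip(XS, YS)]
    loss = sum(e * e for e in errors) / n
    grad_w = 2.0 * sum(e * x for e, x in zip(errors, XS)) / n
    grad_b = 2.0 * sum(errors) / n
    return loss, grad_w, grad_b


def train_sgd(steps, lr=0.01):
    """Plain gradient descent on the full batch; lr is an assumed default."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        _, grad_w, grad_b = mse_and_grads(w, b)
        w -= lr * grad_w
        b -= lr * grad_b
    return mse_and_grads(w, b)[0]


def train_adam(steps, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-7):
    """Adam with bias-corrected moment estimates; constants are assumed defaults."""
    params = [0.0, 0.0]  # [w, b]
    m = [0.0, 0.0]       # running mean of gradients
    v = [0.0, 0.0]       # running mean of squared gradients
    for t in range(1, steps + 1):
        _, grad_w, grad_b = mse_and_grads(params[0], params[1])
        for i, g in enumerate((grad_w, grad_b)):
            m[i] = beta1 * m[i] + (1 - beta1) * g
            v[i] = beta2 * v[i] + (1 - beta2) * g * g
            m_hat = m[i] / (1 - beta1 ** t)
            v_hat = v[i] / (1 - beta2 ** t)
            params[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return mse_and_grads(params[0], params[1])[0]


initial_loss = mse_and_grads(0.0, 0.0)[0]
results = {"sgd": train_sgd(500), "adam": train_adam(500)}

print(f"initial loss: {initial_loss:.6f}")
for name, loss in results.items():
    print(f"{name:>4} final loss: {loss:.6f}")
```

Both runs use the same starting point and the same number of steps, so the only thing that differs is the update rule and its default step size. Changing the step count or the learning rates can change which optimizer comes out ahead, which is the point about the choice being problem-specific.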
Hope this helps.