Choice of SGD instead of Adam


I thought Adam was kinda the state-of-the-art optimizer.
I saw RMSProp and SGD used in this course. Is that for educational purposes (showing us how to play with hyperparameters), or is there a technical reason?

Thank you in advance.

The choice of optimizer is problem-specific. Consider the first assignment in Course 1, the housing price prediction problem.
If you train it with both Adam and SGD using only their default hyperparameters, you’ll find that SGD reaches a lower loss.
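
If you want to try this kind of comparison outside the notebook, here is a rough sketch in plain Python rather than Keras. It is not the assignment code: the toy data, the zero starting weights, and the function names (mse_and_grads, train_sgd, train_adam) are all made up for illustration. The only things borrowed from Keras are the default learning rates (0.01 for SGD, 0.001 for Adam) and Adam's default beta_1, beta_2 and epsilon. "SGD" here means plain full-batch gradient descent without momentum.

```python
import math

# Toy data, invented for illustration: price (in 100k units) = 0.5 * bedrooms + 0.5
XS = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
YS = [0.5 * x + 0.5 for x in XS]


def mse_and_grads(w, b):
    """Return the mean squared error and its gradients with respect to w and b."""
    n = len(XS)
    errors = [w * x + b - y for x, y in zip(XS, YS)]
    loss = sum(e * e for e in errors) / n
    grad_w = 2.0 * sum(e * x for e, x in zip(errors, XS)) / n
    grad_b = 2.0 * sum(errors) / n
    return loss, grad_w, grad_b


def train_sgd(epochs, lr=0.01, w=0.0, b=0.0):
    """Full-batch gradient descent without momentum; returns the final loss."""
    for _ in range(epochs):
        _, gw, gb = mse_and_grads(w, b)
        w -= lr * gw
        b -= lr * gb
    return mse_and_grads(w, b)[0]


def train_adam(epochs, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-7, w=0.0, b=0.0):
    """Adam with bias-corrected moment estimates; returns the final loss."""
    params = [w, b]
    m = [0.0, 0.0]  # running mean of gradients
    v = [0.0, 0.0]  # running mean of squared gradients
    for t in range(1, epochs + 1):
        _, gw, gb = mse_and_grads(params[0], params[1])
        for i, g in enumerate((gw, gb)):
            m[i] = beta1 * m[i] + (1.0 - beta1) * g
            v[i] = beta2 * v[i] + (1.0 - beta2) * g * g
            m_hat = m[i] / (1.0 - beta1 ** t)
            v_hat = v[i] / (1.0 - beta2 ** t)
            params[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return mse_and_grads(params[0], params[1])[0]


if __name__ == "__main__":
    # Print the final loss of each optimizer for a few training lengths.
    for epochs in (100, 500, 2000):
        print(epochs, "epochs | SGD loss:", train_sgd(epochs),
              "| Adam loss:", train_adam(epochs))
```

Which optimizer ends up with the lower loss depends on the starting weights, the learning rates, and how long you train, so try a few settings (for example different values for the w and b arguments) before drawing conclusions.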
Hope this helps.