While practicing with the optimization algorithms in Week 2's assignment, I noticed that Adam performs the worst of the three algorithms in Section 6.3 - Mini-Batch with Adam, and the cost curve it produces does not make much sense (see the screenshot).
Did anyone observe anything like that?
(My code in update_parameters_with_adam passed the test).
The accuracy I get from Adam is 94%, which is greater than the 71% I see with plain mini-batch gradient descent or with momentum. Note that 71% is also higher than the accuracy you are getting even from Adam.
If your functions pass the test cases in the notebook, my only theory is that you modified some other part of the notebook to create this effect. You might want to start with a clean notebook, copy/paste over just your completed code, and see if that makes a difference. There is a procedure for that documented on the DLS FAQ Thread.
It turns out there was a bug in my code that was not caught by the update_parameters_with_adam test, because the test initializes v and s with zeros. Once I fixed it, I got the expected 94% accuracy with Adam, so my original issue is resolved.
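To illustrate the kind of thing that can slip through (this is just a hypothetical example, not necessarily the exact mistake I made): if the second-moment update drops the beta2 decay on the old s, a test that starts from s = 0 cannot tell the buggy and correct versions apart, because beta2 * 0 and 1 * 0 are the same thing. They only diverge on later updates:

```python
import numpy as np

def adam_moments(grad, v, s, beta1=0.9, beta2=0.999, buggy=False):
    """Update Adam's first/second moment estimates for one gradient.

    The buggy variant forgets the beta2 decay on the old second moment --
    a hypothetical slip that a check starting from s = 0 cannot detect.
    """
    v = beta1 * v + (1 - beta1) * grad
    if buggy:
        s = s + (1 - beta2) * grad**2           # missing the beta2 * s decay
    else:
        s = beta2 * s + (1 - beta2) * grad**2
    return v, s

grad = np.array([0.5, -1.0])
v_ok, s_ok = np.zeros(2), np.zeros(2)
v_bad, s_bad = np.zeros(2), np.zeros(2)

for t in range(1, 4):
    v_ok, s_ok = adam_moments(grad, v_ok, s_ok)
    v_bad, s_bad = adam_moments(grad, v_bad, s_bad, buggy=True)
    print(t, np.allclose(s_ok, s_bad))   # True at t=1, False afterwards
```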
As a side note, I've also found that one can reach the same >94% accuracy with plain mini-batch gradient descent by increasing the learning rate; as it turns out, the value in the notebook is too small. Moreover, learning rates in plain gradient descent and in Adam operate on different scales, so comparing the methods with numerically identical (and untuned) learning rates does not really demonstrate the advantage of one over the other.
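Here is a small back-of-the-envelope sketch of what I mean by different scales (the numbers are made up for illustration): with plain gradient descent the step is learning_rate * gradient, so a small gradient means a tiny step, while Adam divides by the square root of the second-moment estimate, so its per-parameter step is roughly the learning rate itself regardless of the gradient's magnitude.

```python
import numpy as np

# Rough comparison of effective step sizes, assuming a constant gradient.
lr = 7e-4              # a small learning rate, for illustration
grad = 0.05            # a modest, constant gradient component
beta1, beta2, eps = 0.9, 0.999, 1e-8

# Plain gradient descent: step scales with the gradient magnitude.
gd_step = lr * grad

# Adam: with a constant gradient, the bias-corrected moments settle at
# grad and grad**2, so the update is roughly lr * sign(grad).
v = s = 0.0
for t in range(1, 51):
    v = beta1 * v + (1 - beta1) * grad
    s = beta2 * s + (1 - beta2) * grad**2
    v_hat = v / (1 - beta1**t)
    s_hat = s / (1 - beta2**t)
adam_step = lr * v_hat / (np.sqrt(s_hat) + eps)

print(f"GD step:   {gd_step:.2e}")    # ~3.5e-05, scales with |grad|
print(f"Adam step: {adam_step:.2e}")  # ~7.0e-04, roughly lr itself
```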