Course 2: Week 2 Exercise 6.3 & 7

Hello - All my tests pass for update_parameters_with_adam; however, when I run the models to compare the accuracy of mini-batch GD, mini-batch GD with momentum, and mini-batch GD with Adam, I'm not matching the expected accuracy. Mini-batch GD reaches the expected 71%, mini-batch GD with momentum reaches the expected 71%, but the model with Adam only reaches 75% instead of the expected 94%. I've rerun all the cells and double-checked that all the tests pass.
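In case it helps with debugging, here is a minimal sketch of a single Adam update for one parameter array (the adam_step name and the default values are mine, not the assignment's; the formulas are the standard Adam update). Subtle slips that can slide past some checks include updating with the uncorrected moments instead of the bias-corrected ones, or putting epsilon inside the square root:

```python
import numpy as np

def adam_step(w, grad, v, s, t, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for parameter w at timestep t (t starts at 1)."""
    # Exponentially weighted moving averages of the gradient and its square
    v = beta1 * v + (1 - beta1) * grad
    s = beta2 * s + (1 - beta2) * grad ** 2
    # Bias correction: without it, early updates are far too small
    v_hat = v / (1 - beta1 ** t)
    s_hat = s / (1 - beta2 ** t)
    # Note: epsilon is added OUTSIDE the square root in the standard formula
    w = w - lr * v_hat / (np.sqrt(s_hat) + eps)
    return w, v, s

# First step from zero-initialized moments with unit gradient:
w, v, s = adam_step(np.zeros(2), np.ones(2), np.zeros(2), np.zeros(2), t=1)
# After bias correction, the first step has magnitude ~lr regardless of scale
```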

Possibly interesting data point: for exercise 7, all tests pass, and the first 4000 epochs look identical, but the accuracy after 5000 epochs is just slightly off:
Expected Output:
Epoch Number Learning Rate Cost
0 0.100000 0.701091
1000 0.000100 0.661884
2000 0.000050 0.658620
3000 0.000033 0.656765
4000 0.000025 0.655486
5000 0.000020 0.654514
Actual Output:
Cost after epoch 0: 0.701091
learning rate after epoch 0: 0.100000
Cost after epoch 1000: 0.661884
learning rate after epoch 1000: 0.000100
Cost after epoch 2000: 0.658620
learning rate after epoch 2000: 0.000050
Cost after epoch 3000: 0.656765
learning rate after epoch 3000: 0.000033
Cost after epoch 4000: 0.655486
learning rate after epoch 4000: 0.000025
(after 5000 epochs)
Accuracy: 0.6533333333333333
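For context, the learning-rate column above is consistent with simple inverse-time decay; a sketch reproducing those numbers (the update_lr name and the decay_rate=1 value are my assumptions, inferred from the printed values, not copied from the assignment):

```python
def update_lr(lr0, epoch_num, decay_rate=1.0):
    # Inverse-time decay: the rate shrinks as 1 / (1 + decay_rate * epoch)
    return lr0 / (1 + decay_rate * epoch_num)

for epoch in (0, 1000, 2000, 3000, 4000, 5000):
    print(f"learning rate after epoch {epoch}: {update_lr(0.1, epoch):.6f}")
# epoch 0 -> 0.100000, 1000 -> 0.000100, 2000 -> 0.000050,
# 3000 -> 0.000033, 4000 -> 0.000025, 5000 -> 0.000020
```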

Grader output shows 100/100, so no complaints grade-wise, but maybe this helps point to whatever issue y'all are seeing.

Hi @Anomy , welcome to the DLS !

Thanks for letting us know that there is a difference between the actual results and the expected ones! We will pass it along to the dev team.

That said, the only issue there is that the exercise does not print the cost and learning rate for epoch 5000.

The accuracy is not the same as the cost. Those two values will almost never be similar, except for really small networks with only a few samples, like the ones used in the exercise tests.

The cost can be any real value, from very small to very large, and is only comparable with other costs from the same network and training set (for example, comparing the cost of one iteration with the cost of the previous one, or comparing two runs of the NN with different hyperparameters).
The accuracy is the percentage of samples that were correctly labeled by the neural network, and it can be used to compare the performance of different neural networks.

So the fact that they are similar here is a very uncommon coincidence.
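To make the distinction concrete, here is a tiny sketch with made-up predicted probabilities, where the accuracy is 100% while the cross-entropy cost is still around 0.49:

```python
import numpy as np

# Four binary labels and made-up predicted probabilities P(y=1)
y_true = np.array([1, 0, 1, 1])
y_prob = np.array([0.6, 0.4, 0.7, 0.55])

# Cross-entropy cost: penalizes low confidence even on correct predictions
cost = -np.mean(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))

# Accuracy: only cares which side of 0.5 each prediction falls on
accuracy = np.mean((y_prob > 0.5).astype(int) == y_true)

print(f"cost = {cost:.3f}, accuracy = {accuracy:.2f}")
# Every prediction is on the correct side of 0.5 (accuracy = 1.00),
# yet the cost is far from 0 because the model is not confident.
```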

Oh my gosh, you’re so right! How silly of me, thanks!
Good health to you and yours,

For whatever it's worth, I'm also seeing the problem identified here: all tests for exercises 1-5 pass, and mini-batch GD and mini-batch GD with momentum perform as expected, but the model with Adam only reaches 75% rather than the expected 94% accuracy.