Week 2 exercise 6 incorrect values

Unfortunately, I can't work out where the issue is in exercise 6: my values don't match the expected ones. I've reviewed the code several times and I'm out of ideas.

My code:

{moderator edit - solution code removed}

Output:
W1 =
[[ 1.64767045 -0.65461531 -0.56227033]
[-1.05420115 0.84734229 -2.32427806]]
W2 =
[[ 0.34852955 -0.17350776]
[ 1.48405247 -1.99814328]
[-0.35948341 -0.3548763 ]]
b1 =
[[ 1.70596884]
[-0.78396397]]
b2 =
[[ 1.18098056]
[-1.07053725]
[-0.14824754]]

AssertionError Traceback (most recent call last)
in
7 print(f"b2 = \n{parameters['b2']}")
8
----> 9 update_parameters_with_adam_test(update_parameters_with_adam)

~/work/release/W2A1/public_tests.py in update_parameters_with_adam_test(target)
280 assert type(parameters[key]) == np.ndarray, f"Wrong type for parameters['{key}']. Expected np.ndarray"
281 assert parameters[key].shape == parametersi[key].shape, f"Wrong shape for parameters['{key}']. The update must keep the dimensions of parameters inputs"
--> 282 assert np.allclose(parameters[key][0], expected_parameters[key]), f"Wrong values. Check you formulas for parameters['{key}']"
283 #print(f"{key}: \n {str(parameters[key])}")
284

AssertionError: Wrong values. Check you formulas for parameters['W1']

Expected values:
W1 =
[[ 1.63937725 -0.62327448 -0.54308727]
[-1.0578897 0.85032154 -2.31657668]]
W2 =
[[ 0.33400549 -0.23563857]
[ 1.47715417 -2.04561842]
[-0.33729882 -0.36908457]]
b1 =
[[ 1.72995096]
[-0.7762447 ]]
b2 =
[[ 1.14852557]
[-1.08492339]
[-0.15740527]]

Hi @yunti !

Just a friendly reminder that posting full solution code is discouraged in the community. As for your function, the main issue is that the parameter update omits the square root of the bias-corrected second moment estimate. This step is essential for the Adam optimization algorithm to work properly.

In the Adam optimizer, the parameter update rule is:
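$$\theta := \theta - \alpha \frac{v_{corrected}}{\sqrt{s_{corrected}} + \epsilon}$$

Here \theta is the parameter being updated (W or b) and \alpha is the learning rate.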

  • v_corrected is the bias-corrected first moment estimate.
  • s_corrected is the bias-corrected second moment estimate.
  • Calculating the square root of s_corrected is essential because s_corrected is a moving average of the squared gradients (an uncentered variance), and taking the square root brings it back to the scale of the gradients.
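
To make the rule concrete, here is a minimal sketch of one Adam step for a single scalar parameter. It is an illustration only, not the assignment's solution: the helper name `adam_step` and the default hyperparameter values are assumptions made for this example, and in the notebook the same arithmetic would be applied elementwise to each W and b array.

```python
import math

def adam_step(theta, grad, v, s, t, learning_rate=0.01,
              beta1=0.9, beta2=0.999, epsilon=1e-8):
    """One Adam update for a scalar parameter (hypothetical helper)."""
    # Moving averages of the gradient and of the squared gradient.
    v = beta1 * v + (1 - beta1) * grad
    s = beta2 * s + (1 - beta2) * grad ** 2
    # Bias correction; t is the 1-based step count.
    v_corrected = v / (1 - beta1 ** t)
    s_corrected = s / (1 - beta2 ** t)
    # Square root of s_corrected, with epsilon added outside the root.
    theta = theta - learning_rate * v_corrected / (math.sqrt(s_corrected) + epsilon)
    return theta, v, s

# Example values are made up for illustration.
theta, v, s = adam_step(theta=1.0, grad=0.5, v=0.0, s=0.0, t=1)
print(theta)
```

On the first step, v_corrected equals the gradient and the square root of s_corrected equals its absolute value, so the parameter moves by roughly learning_rate. Without the square root, the step would instead be scaled by the squared gradient.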

Feel free to reach out if you encounter any further issues. Happy learning!


Yes, your implementation does not take the square root. One other point to be careful about: the \epsilon value is in the denominator, but not under the square root.
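
A small sketch of why the placement of \epsilon matters, using made-up numbers (the value of `s_corrected` and the variable names `outside` and `inside` are assumptions for illustration only):

```python
import math

s_corrected = 1e-12   # a tiny second-moment estimate (made-up value)
epsilon = 1e-8

# Epsilon added after the square root, as in the update rule discussed here.
outside = math.sqrt(s_corrected) + epsilon
# Epsilon mistakenly placed under the square root.
inside = math.sqrt(s_corrected + epsilon)

print(outside)
print(inside)
```

When the second moment is small, the two denominators differ by about two orders of magnitude in this example, which is more than enough for a value check such as np.allclose to fail.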


Thanks for your replies, that fixed it (and thanks for pointing out that epsilon sits outside the square root too). Sorry for posting the code, I wasn't sure of a better way to resolve it.
Thanks for your help.
