DLS Course 2 Week 2 A1 Exercise Adam optimizer

I am getting an error while running my Adam optimizer code. I tried writing the equation in different ways, but the error is still there. It would be helpful if you could point out my mistake; meanwhile, I am also trying to find it myself.

Hi @Shan.ali ,

Make sure you follow the formula for the calculation correctly. Also, the operators * and / have higher precedence than + and -, so make sure you use parentheses to force an expression to evaluate in the order you want.
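As a small illustration of the precedence point above, here is a minimal sketch. The variable names and values are made up for the example and are not taken from the assignment.

```python
# Hypothetical values chosen only for illustration.
a, b, c = 6.0, 2.0, 4.0

# Division binds tighter than addition, so this evaluates as (a / b) + c.
without_parens = a / b + c      # 3.0 + 4.0 = 7.0

# Parentheses force the sum to be evaluated first.
with_parens = a / (b + c)       # 6.0 / 6.0 = 1.0

print(without_parens, with_parens)
```

The two expressions use the same symbols but give different results, which is the kind of slip that shows up as a mismatch in the second or third digit.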


Right! The thing that most people have trouble with in terms of the “order of operations” that Kin points out is the \epsilon term: note that it is in the denominator, but not under the square root, right?
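To make the \epsilon placement concrete, here is a minimal sketch on scalars. The function names, the learning rate, and the input values are assumptions for illustration only; the assignment works on arrays and uses its own variable names.

```python
import math

def step_eps_outside(v_hat, s_hat, lr=0.01, eps=1e-8):
    # epsilon is added to the denominator AFTER taking the square root
    return lr * v_hat / (math.sqrt(s_hat) + eps)

def step_eps_inside(v_hat, s_hat, lr=0.01, eps=1e-8):
    # epsilon placed under the square root: a different formula
    return lr * v_hat / math.sqrt(s_hat + eps)

# Hypothetical bias-corrected moments; s_hat is small so the gap is visible.
outside = step_eps_outside(0.5, 1e-8)
inside = step_eps_inside(0.5, 1e-8)
print(outside, inside)
```

With a larger s_hat the two versions differ only slightly, which is why this mistake tends to surface as a small numeric mismatch rather than an obviously wrong result.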


Hello, I’m getting a similar error in the 2nd or 3rd digit of my matrices, with the message saying that my vi_corrected computation is incorrect.
I use the formula shown by the notebook but can’t seem to fix it, even though I’ve used parentheses to ensure the proper order of operations.

Could you give any pointers on what to try next ?

Hi @Alexis_Staboultzidis ,

Please check that your code follows the formula exactly and uses parentheses as mentioned in this post. Also, make sure you restart the kernel and rerun the code if you have made changes.

Should I follow the error message and focus on the V_corrected[‘dW’] formula (that equation is pretty simple and I highly doubt my error is there…), or is the message just indicative and the error may be in another line of my code?

image

In this image epsilon is not under the sqrt, correct?
And do the v_corrected and s_corrected formulas use a denominator of 1 - (beta#) raised to the power of the respective layer, or (1 - beta#) with the whole parenthesis raised to the power of the respective layer?
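For reference, the two readings of the denominator can be compared directly. In the standard Adam formulation the bias correction divides by 1 - beta^t, where only beta is raised to the power and the exponent t is the update step counter. The values below are assumptions for illustration.

```python
beta1 = 0.9   # hypothetical decay rate
t = 2         # Adam step counter (assumed value for illustration)

# ** binds tighter than -, so this evaluates as 1 - (beta1 ** t).
denominator = 1 - beta1 ** t        # 1 - 0.81 = 0.19

# Raising the whole parenthesis to the power gives a different number.
whole_parenthesis = (1 - beta1) ** t    # 0.1 ** 2 = 0.01

print(denominator, whole_parenthesis)
```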

Hi @Alexis_Staboultzidis ,

Since the AssertionError points at v_corrected[‘dw’], we should sort that out first. The formula for V_dw^{[l]} is very simple, and using parentheses to enforce the evaluation order is important, but have you also checked how the len is calculated? Is dw being addressed correctly? Is a hardcoded value being used?

Thank you for your pointer. I had misinterpreted the parameter l as t… :sweat_smile:
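For anyone hitting the same mix-up, here is a minimal sketch of how l and t play different roles. It is not the notebook's solution: the function name, the scalar "gradients", and the dictionary layout keyed "dW1", "dW2" are assumptions for illustration, and only the first-moment update for weights is shown.

```python
def bias_corrected_moments(v, grads, beta1, t):
    """Update first moments in place and return bias-corrected copies.

    v, grads: dicts keyed "dW1", "dW2", ... (hypothetical layout).
    t: Adam step counter, the same value for every layer.
    """
    num_layers = len(v)  # derived from the dict, not hardcoded
    v_corrected = {}
    for l in range(1, num_layers + 1):
        key = "dW" + str(l)  # l only selects which layer to update
        v[key] = beta1 * v[key] + (1 - beta1) * grads[key]
        v_corrected[key] = v[key] / (1 - beta1 ** t)  # exponent is t, not l
    return v_corrected

# Hypothetical scalar "gradients" for a two-layer example.
v = {"dW1": 0.0, "dW2": 0.0}
grads = {"dW1": 1.0, "dW2": 2.0}
v_corrected = bias_corrected_moments(v, grads, beta1=0.9, t=1)
print(v_corrected)
```

Using l as the exponent would give layer 1 the right correction on the first step and every other layer a wrong one, which matches the small per-matrix mismatches described above.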

On Fri, Jan 20, 2023 at 5:16 PM, Kin Cheung via DeepLearning.AI <notifications@dlai.discoursemail.com> wrote: