W2 Exercise 5 Update parameters with Adam

vdW[l] = beta1 * vdW[l] + (1 - beta1) * dJ/dW[l]
sdW[l] = beta2 * sdW[l] + (1 - beta2) * (dJ/dW[l])**2

In these two formulas there is dJ/dW[l]. Should dJ here be the cost?

But in the function update_parameters_with_adam(…) there is no trace of the cost.
In the lectures the formulas also have no reference to the cost.
I do not understand; can it be omitted?
Thanks for the help.

Can you point out where this part is, so I can have a look at that function?

Sorry, it's Exercise 6, "Update parameters with Adam"; the formulas are at the beginning.

In update_parameters_with_adam, we have:

grads -- python dictionary containing your gradients for each parameter:

There is no such thing as dJ on its own. There is dJ/dW (written dW in the code). It is the derivative of the cost J with respect to W, and it is available in the grads dictionary.
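To make this concrete, here is a minimal sketch of what such a gradients dictionary looks like. The layer sizes and the exact key naming (`"dW1"`, `"db1"`, …) are assumptions for illustration, following the usual course convention:

```python
import numpy as np

# Hypothetical gradients dictionary for a 2-layer network
grads = {
    "dW1": np.random.randn(3, 2),  # dJ/dW for layer 1
    "db1": np.random.randn(3, 1),  # dJ/db for layer 1
    "dW2": np.random.randn(1, 3),  # dJ/dW for layer 2
    "db2": np.random.randn(1, 1),  # dJ/db for layer 2
}

# Inside the update loop, layer l's gradient is looked up by key;
# this array already IS dJ/dW^{[l]} -- there is no separate dJ object.
l = 1
dW = grads["dW" + str(l)]
```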

Best,
Saif.

Thanks for the previous answer… I had mistaken dJ for the cost function.

I have also this one

Make sure you are implementing all these equations correctly (for W and b). Double-check them. It’s a little bit perplexing.

$$
\begin{cases}
v_{dW^{[l]}} = \beta_1 v_{dW^{[l]}} + (1 - \beta_1) \frac{\partial \mathcal{J}}{\partial W^{[l]}} \\
v^{corrected}_{dW^{[l]}} = \frac{v_{dW^{[l]}}}{1 - (\beta_1)^t} \\
s_{dW^{[l]}} = \beta_2 s_{dW^{[l]}} + (1 - \beta_2) \left(\frac{\partial \mathcal{J}}{\partial W^{[l]}}\right)^2 \\
s^{corrected}_{dW^{[l]}} = \frac{s_{dW^{[l]}}}{1 - (\beta_2)^t} \\
W^{[l]} = W^{[l]} - \alpha \frac{v^{corrected}_{dW^{[l]}}}{\sqrt{s^{corrected}_{dW^{[l]}}} + \varepsilon}
\end{cases}
$$
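For reference, the five equations above can be sketched for a single layer as follows. This is a minimal illustration, not the notebook's actual function (which loops over all layers and handles b the same way); the function and variable names here are my own:

```python
import numpy as np

def adam_update_layer(W, dW, v_dW, s_dW, t,
                      learning_rate=0.01, beta1=0.9, beta2=0.999, epsilon=1e-8):
    """One Adam step for a single parameter matrix W (same pattern applies to b)."""
    # Moving average of the gradients (first moment)
    v_dW = beta1 * v_dW + (1 - beta1) * dW
    # Bias-corrected first moment
    v_corrected = v_dW / (1 - beta1 ** t)
    # Moving average of the SQUARED gradients (second moment) -- note dW ** 2
    s_dW = beta2 * s_dW + (1 - beta2) * (dW ** 2)
    # Bias-corrected second moment
    s_corrected = s_dW / (1 - beta2 ** t)
    # Parameter update: epsilon is added OUTSIDE the square root
    W = W - learning_rate * v_corrected / (np.sqrt(s_corrected) + epsilon)
    return W, v_dW, s_dW
```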

I double-checked four times… no clues.

Send me your update_parameters_with_adam code in a private message. Click my name and message.

Thank you for sending me your code.

OK. We have "squared" terms in multiple places, and you are not squaring them. Check the given formulas against your implementation. Also, epsilon is not part of the square root. Compare your implementation, word by word, with the given formulas.

I did square them, but somehow the copy and paste didn't show it.

** ← I wrote it like this, believe me.

OK! Now correct them and update us here.

It was only the epsilon, as you said.
Thanks, Saif.