Course 2, week 2, update_parameters_with_momentum() issue

I got my code to pass the checker, but I don't entirely understand the math behind this.

v["dW" + str(l)] = beta * v["dW" + str(l)] + (1 - beta) * grads["dW" + str(l)]   (1)
which is supposedly the correct implementation, looks to me the same as:
v["dW" + str(l)] = (1 - beta) * grads["dW" + str(l)]   (2)
since v["dW" + str(l)] is initialized to 0.

I tried (2) and it passed all the tests.

Should it be v["dW" + str(l-1)] for l > 1 and just 0 for l = 1, since we take the 'beta' part of the LAST momentum and give it a bit more acceleration?

Am I understanding this correctly?

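Your formulas (1) and (2) are not equivalent. Note that v["dW1"] is not the same thing as grads["dW1"]. Also note that these methods are iterative, right? The velocity is set to zero only once, before the first iteration. On every later iteration, v["dW" + str(l)] already holds the value computed for that same layer on the previous iteration, so the beta term is no longer zero.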
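To make the difference concrete, here is a minimal sketch for a single layer with made-up gradient values (the numbers, the variable names v1, v2, and history, and the choice beta = 0.9 are assumptions for illustration, not taken from the assignment). It applies (1) and (2) side by side over three iterations:

```python
import numpy as np

beta = 0.9  # assumed momentum hyperparameter for this illustration

# Hypothetical gradients of one layer over three iterations (made-up values).
grads_per_iteration = [np.array([1.0]), np.array([1.0]), np.array([1.0])]

v1 = np.zeros(1)  # velocity under formula (1), initialized once before training
v2 = np.zeros(1)  # velocity under formula (2)
history = []

for t, dW in enumerate(grads_per_iteration, start=1):
    v1 = beta * v1 + (1 - beta) * dW  # formula (1): keeps the previous velocity
    v2 = (1 - beta) * dW              # formula (2): discards the previous velocity
    history.append((v1.item(), v2.item()))
    print(f"iteration {t}: (1) -> {v1.item():.3f}, (2) -> {v2.item():.3f}")

# iteration 1: (1) -> 0.100, (2) -> 0.100
# iteration 2: (1) -> 0.190, (2) -> 0.100
# iteration 3: (1) -> 0.271, (2) -> 0.100
```

The two agree only on the first iteration, when the velocity is still zero. A test that calls the update once on a freshly initialized v would not be able to tell them apart, which may be why (2) passed, though that is a guess about the checker.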

For a second I confused an iteration with a layer. I realized that immediately after moving on to the next function in the exercise. This is really helpful. Thank you for the fast response, sir.