Confused about the momentum updates!

In the Week 2 assignment of the DLS course, where we were asked to implement update_parameters_with_momentum(), I wrote the code shown below:
[screenshot: the asker's implementation of the momentum update]

However, v["dW1"], v["db1"], v["dW2"], and v["db2"] are all initialized as arrays of zeros, so v["dW" + str(l)] and v["db" + str(l)] always refer to arrays of zeros. Hence, the previous results in v["dW" + str(l - 1)] and v["db" + str(l - 1)] never seemed to have any bearing on their counterparts in the next iteration.
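
For reference, the velocities are initialized along these lines (a rough sketch of the course's initialize_velocity step, assuming parameters is stored as {W1, b1, ..., WL, bL}):

    import numpy as np

    def initialize_velocity(parameters):
        # Every velocity starts as a zero array with the same shape
        # as the parameter it will eventually update.
        L = len(parameters) // 2  # number of layers
        v = {}
        for l in range(1, L + 1):
            v["dW" + str(l)] = np.zeros_like(parameters["W" + str(l)])
            v["db" + str(l)] = np.zeros_like(parameters["b" + str(l)])
        return v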

Therefore, I was wondering: how is any momentum generated if the calculation at iteration l does not take the v value from the previous iteration (l - 1)?

I am very confused. Did I overlook something?
Not sure if I am explaining my question well, but let me know what you think.

Thank you for helping out.
Linfeng

Hi, @Linfeng_W.

The averaging takes place across epochs (for simplicity, let’s assume we’re doing batch gradient descent), not across layers. These are the iterations you should be thinking of:

for i in range(num_epochs):
	...
	parameters, v = update_parameters_with_momentum(parameters, grads, v, beta, learning_rate)

The loop inside update_parameters_with_momentum simply updates the parameters of every layer.
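
For concreteness, here is a minimal sketch of that function, assuming the standard formulas v = beta * v + (1 - beta) * dW and W = W - learning_rate * v, and the same {W1, b1, ..., WL, bL} layout for parameters:

    def update_parameters_with_momentum(parameters, grads, v, beta, learning_rate):
        # Called once per epoch: v arrives holding the moving average built up
        # over all previous epochs and leaves updated for the next call.
        L = len(parameters) // 2  # number of layers
        for l in range(1, L + 1):
            # Exponentially weighted average of this layer's gradients across epochs.
            v["dW" + str(l)] = beta * v["dW" + str(l)] + (1 - beta) * grads["dW" + str(l)]
            v["db" + str(l)] = beta * v["db" + str(l)] + (1 - beta) * grads["db" + str(l)]
            # Step in the direction of the velocity rather than the raw gradient.
            parameters["W" + str(l)] = parameters["W" + str(l)] - learning_rate * v["dW" + str(l)]
            parameters["b" + str(l)] = parameters["b" + str(l)] - learning_rate * v["db" + str(l)]
        return parameters, v

Note that l only indexes layers within a single call. The memory of past gradients lives entirely in v, which is threaded through every pass of the outer loop and holds zeros only before the very first epoch.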

Was that helpful? 🙂

Ah, I see! Somehow I thought the momentum was carried over between layers. You are totally right that the averaging takes place across epochs. It makes sense now.

Thank you so much for clearing things up for me.

Glad I could help. Good luck with the rest of the course! 🙂