HELP - Something not clear with momentum gradient descent


Hello,
In the video explaining momentum, he shows the following formulas and I don't understand why they are correct.
You initialise v_dW to 0, and then compute v_dW = \beta * v_dW + (1 - \beta) * dW.
I don't understand why we multiply by the whole derivative dW. Aren't we supposed to use only the derivatives from the last ~10 iterations?

thank you.

We aren't using "the whole derivative of W": on each iteration the current dW is scaled by (1 - \beta) before it is folded into the running average. That was all explained in the lectures: depending primarily on the recent values is exactly what Exponentially Weighted Averages do for you, and that's the formula with \beta and (1 - \beta) as the factors. Prof Ng devotes several lectures to explaining how EWAs work and how to apply them for purposes like Momentum here.
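
To make that "recent values" point concrete, here is a small unrolling of the recursion, purely as my own illustration (not code from the course). Substituting the formula into itself shows that the gradient from k iterations ago ends up weighted by (1 - \beta) * \beta^k, so with \beta = 0.9 the last ~10 gradients supply most of the weight, which is the "average over roughly 1 / (1 - \beta) values" intuition from the EWA lectures:

```python
import numpy as np

beta = 0.9
num_iters = 30

# Weight on the gradient from k iterations ago, obtained by unrolling
# v_dW = beta * v_dW + (1 - beta) * dW back through the loop.
weights = np.array([(1 - beta) * beta**k for k in range(num_iters)])

print(weights[:3])         # [0.1, 0.09, 0.081] -> the most recent gradients dominate
print(weights[:10].sum())  # ~0.65: the last 10 gradients carry most of the total weight
print(weights.sum())       # ~0.96: approaches 1, so v_dW acts like a weighted average
```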

I don't think you understood me.
My question is why we're using dW and not dW[:, i:i+1], for example.

dW is not indexed by iteration, right? Its dimensions are neurons out by neurons in. The influence of recent iterations is handled by the EWA computation as you run the training loop; that is the purpose of \beta in the formula you show. I may still be missing your point, but I really think you should watch all the lectures about EWA again.
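
If it helps to see it in code, here is a minimal NumPy sketch using a toy quadratic loss of my own (not the course assignment): dW is always the full gradient of the current iteration, with the same shape as W, and the blending across recent iterations happens only through v_dW being carried from one pass of the loop to the next.

```python
import numpy as np

def momentum_update(W, v_dW, dW, beta=0.9, learning_rate=0.1):
    # dW is the full gradient from the current iteration only; it is never
    # indexed or sliced by iteration. The mixing of past iterations happens
    # through v_dW, which is carried from one step to the next.
    v_dW = beta * v_dW + (1 - beta) * dW
    W = W - learning_rate * v_dW
    return W, v_dW

# Toy problem: minimise ||W - W_target||^2, whose gradient 2 * (W - W_target)
# has the same shape as W (neurons out by neurons in).
W_target = np.ones((4, 3))
W = np.random.randn(4, 3)
v_dW = np.zeros_like(W)                  # initialised to zero, as in the lecture
for t in range(200):
    dW = 2 * (W - W_target)              # gradient for this iteration only
    W, v_dW = momentum_update(W, v_dW, dW)

print(np.abs(W - W_target).max())        # close to 0 after the loop
```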

I’m referring to the lecture Understanding Exponentially Weighted Averages in DLS C2 Week 2. That exactly addresses my interpretation of the question you’re asking.

I'll check it out again and update, thanks!