
Above is the formula for gradient descent with momentum. However, it differs from the exponentially weighted moving average formula, which confuses me. In the formula in the image above, \beta is multiplied by V_{dw}, which is the gradient value of the current layer and not the previous layer.
I would greatly appreciate any help!
Hi @yeoh_zhewei
The exponentially weighted moving average is used to decrease oscillation. Take, for example, the formula X(new) = \beta X(old) + (1-\beta) X(calculated) with \beta = 0.8: the new X consists of 0.8 of the old X and only 0.2 of the newly calculated X.
In other words, the new X is strongly tied to the old X (by 80%). This technique is used in time series to smooth the line of a graph and decrease its oscillation.
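Here is a minimal sketch of that smoothing formula in plain Python/NumPy (the function name `ema` and the noisy test signal are my own, just to illustrate the effect of \beta; no bias correction is applied, so early values start biased toward zero):

```python
import numpy as np

def ema(values, beta=0.8):
    """Exponentially weighted moving average: x_new = beta * x_old + (1 - beta) * x_calculated."""
    smoothed = []
    x_old = 0.0  # start from zero, as in the plain (uncorrected) formula
    for x_calc in values:
        x_old = beta * x_old + (1 - beta) * x_calc
        smoothed.append(x_old)
    return np.array(smoothed)

# A noisy signal: the larger beta is, the smoother the resulting line.
noisy = np.sin(np.linspace(0, 6, 50)) + 0.3 * np.random.randn(50)
smooth = ema(noisy, beta=0.8)
```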
In the image above, the formula is V_{dw}(new) = \beta V_{dw}(old) + (1-\beta) dw(calculated). So V_{dw} is the gradient value for the current layer, but it depends on the previous value V_{dw}(old) with weight \beta and on the newly calculated value dw(calculated) with weight (1-\beta). You can estimate how many past values the new V_{dw} effectively averages over with \frac{1}{1-\beta}: for example, with \beta = 0.8, V_{dw}(new) is related to roughly the last five calculated gradient values. This is what decreases the oscillation when we compute the gradient descent updates.
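As an illustration of how this moving average plugs into the parameter update, here is a short sketch (not the course's exact code; names such as `momentum_step`, `learning_rate`, and the random placeholder gradient are assumptions):

```python
import numpy as np

def momentum_step(w, dw, v_dw, beta=0.8, learning_rate=0.01):
    """One gradient-descent-with-momentum update for a single parameter array.

    v_dw(new) = beta * v_dw(old) + (1 - beta) * dw   (moving average of gradients)
    w(new)    = w - learning_rate * v_dw(new)
    """
    v_dw = beta * v_dw + (1 - beta) * dw
    w = w - learning_rate * v_dw
    return w, v_dw

# v_dw starts at zeros with the same shape as the weights and is carried across
# iterations; with beta = 0.8 it averages over roughly 1 / (1 - 0.8) = 5 past gradients.
w = np.random.randn(3, 2)
v_dw = np.zeros_like(w)
for _ in range(100):
    dw = np.random.randn(*w.shape)  # placeholder gradient, for illustration only
    w, v_dw = momentum_step(w, dw, v_dw)
```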
Regards,
Abdelrahman