Doubt regarding Exercise 4, Week 2 Lab

Syed_Taha · May 2, 2024, 11:47am

I was wondering whether this formula is correct: in the weighted moving average, we take the previous layer's velocity and multiply it by beta, instead of using the current layer's velocity.

Kic · May 2, 2024, 1:59pm

Hi @Syed_Taha ,

Where do you see the previous layer's velocity in the formula?

Syed_Taha · May 2, 2024, 9:26pm

Kic · May 3, 2024, 2:46pm

Hi @Syed_Taha ,

The formula from your first post is the formula for gradient descent with momentum, which is one of the optimization algorithms used in machine learning. If you refer back to the video lecture, you will hear Prof Ng talk about the advantage of averaging the gradients: it helps reach the minimum faster and with less oscillation. This averaging technique is called ‘exponentially weighted averages’. You will also hear Prof Ng explain that V_t is taken at iteration t when running a mini-batch, so the subscript refers to the iteration, not to the layer. Attached are a couple of screenshots for your reference.
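To make the iteration-versus-layer point concrete, here is a minimal sketch of a momentum update. It assumes the standard form v = beta * v + (1 - beta) * grad followed by param = param - learning_rate * v. The function name `update_with_momentum`, the keys `"W1"` and `"W2"`, and the scalar values are hypothetical and are not taken from the lab's code. Each layer keeps its own velocity, and the `v[key]` on the right-hand side is that same layer's velocity from the previous iteration.

```python
def update_with_momentum(params, grads, v, beta=0.9, learning_rate=0.01):
    """One momentum step (illustrative sketch, not the lab's implementation).

    params, grads, v: dicts keyed by parameter name (one entry per layer here).
    Returns new dicts so the inputs are left unchanged.
    """
    new_params, new_v = {}, {}
    for key in params:
        # v[key] is THIS layer's velocity from the previous iteration,
        # not the velocity of the previous layer.
        new_v[key] = beta * v[key] + (1 - beta) * grads[key]
        new_params[key] = params[key] - learning_rate * new_v[key]
    return new_params, new_v


# Hypothetical two-layer example with scalar "weights" for readability.
params = {"W1": 1.0, "W2": -2.0}
grads = {"W1": 2.0, "W2": -4.0}
v = {"W1": 0.0, "W2": 0.0}  # velocities start at zero

# Iteration 1, then iteration 2 (reusing the same gradients for simplicity).
params, v = update_with_momentum(params, grads, v, beta=0.9, learning_rate=0.1)
params, v = update_with_momentum(params, grads, v, beta=0.9, learning_rate=0.1)
```

Note that `v["W1"]` and `v["W2"]` never mix: the velocity of layer 2 depends only on the past gradients of layer 2.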
