RMSprop in weight update - what if vertical slopes are small and horizontal slopes are large?

thanhlamtrinh · September 16, 2021, 12:25pm

Hi Everyone,

I have a question regarding the intuition behind RMSprop.
As shown in the lecture video of the Deep Learning Specialization by Prof. Andrew Ng, RMSprop helps reduce the oscillation (in the vertical direction, parameter b in the example figure) and speeds up convergence to the minimum by taking larger steps along the horizontal axis.

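This is achieved by updating the parameters as:
w := w - \alpha \frac{dw}{\sqrt{S_{dw}}}

b := b - \alpha \frac{db}{\sqrt{S_{db}}}

where \alpha is the learning rate, and S_{dw} and S_{db} are the exponentially weighted averages of dw^2 and db^2.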

So, if dw is initially small, \sqrt{S_{dw}} is small and w takes a larger step (moving forward in the horizontal direction); if db is large, \sqrt{S_{db}} is large and b takes a much smaller step (in the vertical direction).
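To make that reasoning concrete, here is a minimal sketch of a single RMSprop update in plain Python. The function name `rmsprop_step`, the hyperparameter values (`alpha=0.01`, `beta=0.9`, `eps=1e-8`), and the example gradients are assumptions chosen for illustration, not values from the lecture figure.

```python
import math

def rmsprop_step(param, grad, s, alpha=0.01, beta=0.9, eps=1e-8):
    """One RMSprop update for a single scalar parameter.

    Returns the updated parameter and the updated running average s.
    Hyperparameter defaults are illustrative assumptions.
    """
    # Exponentially weighted average of the squared gradient.
    s = beta * s + (1 - beta) * grad ** 2
    # Divide the gradient by its running RMS; eps avoids division by zero.
    param = param - alpha * grad / (math.sqrt(s) + eps)
    return param, s

# Hypothetical slopes: small in the horizontal direction (w),
# large in the vertical direction (b).
dw, db = 0.1, 10.0

w, s_dw = rmsprop_step(0.0, dw, s=0.0)
b, s_db = rmsprop_step(0.0, db, s=0.0)

rms_step_w, rms_step_b = abs(w), abs(b)
# Plain gradient descent steps with the same learning rate, for comparison.
gd_step_w, gd_step_b = 0.01 * abs(dw), 0.01 * abs(db)

print("RMSprop steps:", rms_step_w, rms_step_b)
print("Plain GD steps:", gd_step_w, gd_step_b)
```

Under these assumptions the RMSprop step for w comes out larger than the plain gradient descent step, the step for b comes out smaller, and the two RMSprop steps end up nearly the same size even though the raw gradients differ by a factor of 100.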

However, what if dw is large and db is small? Would the optimization then start oscillating strongly or diverging again?

nramon · September 16, 2021, 3:31pm

Hi, @thanhlamtrinh.

From what I understand, there are indeed situations where a given optimization algorithm may not be the best option. The important thing is whether those situations are representative of the error surface of your problem.

The example presented by Professor Ng is probably a good approximation for a common situation when training a neural network.
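As a rough sanity check of the swapped case, the sketch below runs RMSprop on a simple elongated bowl, f(w, b) = 0.5 * (curv_w * w^2 + curv_b * b^2), once with the steep direction along b and once with it along w. The loss function, the helper names `run_rmsprop` and `loss`, the starting point, and the hyperparameters are all assumptions for illustration. Because RMSprop keeps a separate running average per parameter, the two runs should simply mirror each other.

```python
import math

def loss(w, b, curv_w, curv_b):
    """Hypothetical elongated quadratic bowl used only for this sketch."""
    return 0.5 * (curv_w * w * w + curv_b * b * b)

def run_rmsprop(curv_w, curv_b, steps=500, alpha=0.01, beta=0.9, eps=1e-8):
    """Run RMSprop on the bowl from (w, b) = (1, 1) and return the final point."""
    w, b = 1.0, 1.0
    s_dw, s_db = 0.0, 0.0
    for _ in range(steps):
        # Gradients of the quadratic bowl.
        dw = curv_w * w
        db = curv_b * b
        # Separate running averages of the squared gradient per parameter.
        s_dw = beta * s_dw + (1 - beta) * dw ** 2
        s_db = beta * s_db + (1 - beta) * db ** 2
        # Each parameter is scaled by its own running RMS.
        w -= alpha * dw / (math.sqrt(s_dw) + eps)
        b -= alpha * db / (math.sqrt(s_db) + eps)
    return w, b

# Lecture-style picture: shallow along w, steep along b.
w1, b1 = run_rmsprop(curv_w=1.0, curv_b=100.0)
# Swapped picture: steep along w, shallow along b.
w2, b2 = run_rmsprop(curv_w=100.0, curv_b=1.0)

initial_loss = loss(1.0, 1.0, 1.0, 100.0)
loss_original = loss(w1, b1, 1.0, 100.0)
loss_swapped = loss(w2, b2, 100.0, 1.0)

print("original:", loss_original, "swapped:", loss_swapped)
```

In this toy setting the swapped run behaves like the original with the roles of w and b exchanged: the large gradient is divided by a correspondingly large running RMS, so the steep direction is damped whichever axis it lies on. This says nothing about harder error surfaces, where the steep direction is not aligned with a single parameter.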

Here’s a very interesting post by @jonaslalin on this topic.

Good luck with course 2

thanhlamtrinh · September 19, 2021, 1:39pm

Hi nramon,

Thank you very much for your helpful reply. Wishing you all the best.
