RMSprop in weight update - what if vertical slopes small and horizontal slopes large?

Hi Everyone,

I have a question regarding the intuition behind RMSprop.
As shown in the lecture video of the Deep Learning Specialization by Prof. Andrew Ng, RMSprop helps to reduce the oscillation (the large slopes db along the vertical axis, as in the example figure) and speeds up convergence toward the minimum by taking larger steps along the horizontal axis.

This is achieved by updating our parameters as:

W := W - \alpha \frac{dW}{\sqrt{S_{dW}} + \varepsilon}

b := b - \alpha \frac{db}{\sqrt{S_{db}} + \varepsilon}

where S_{dW} and S_{db} are exponentially weighted averages of the squared gradients.

So, if the horizontal slope dW is small, then \sqrt{S_{dW}} is small, and dividing by it makes W take larger steps (moving faster in the horizontal direction); and if the vertical slope db is large, then \sqrt{S_{db}} is large, so b takes much smaller steps (damping movement in the vertical direction).
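To make this damping effect concrete, here is a minimal sketch of the RMSprop update on a hypothetical toy loss L(w, b) = w² + 25·b² (my own example, not from the lecture), chosen so that the slope along b is much larger than the slope along w. Dividing each coordinate's gradient by the root of its own running average scales the large db down and lets w keep making progress:

```python
import math

# Toy sketch of the RMSprop update rule (hypothetical example, not
# the lecture's code). Loss: L(w, b) = w**2 + 25 * b**2, so slopes
# along b (the "vertical" axis) are much larger than along w.
def rmsprop_step(w, b, s_dw, s_db, lr=0.01, beta=0.9, eps=1e-8):
    dw = 2 * w    # dL/dw: relatively small slope (horizontal)
    db = 50 * b   # dL/db: relatively large slope (vertical)
    # Exponentially weighted averages of the squared gradients
    s_dw = beta * s_dw + (1 - beta) * dw ** 2
    s_db = beta * s_db + (1 - beta) * db ** 2
    # Each coordinate is divided by the root of its own average,
    # so the large db is damped much more strongly than the small dw
    w -= lr * dw / (math.sqrt(s_dw) + eps)
    b -= lr * db / (math.sqrt(s_db) + eps)
    return w, b, s_dw, s_db

w, b, s_dw, s_db = -4.0, 2.0, 0.0, 0.0
for _ in range(200):
    w, b, s_dw, s_db = rmsprop_step(w, b, s_dw, s_db)
print(abs(w), abs(b))  # both coordinates have moved toward (0, 0)
```

Note that after the division both directions end up with steps of roughly the same size (about the learning rate), which is why the oscillation along b no longer forces a tiny learning rate.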

However, what if dW is large and db is small? Would the optimization then start fluctuating strongly, or even diverge, again?

Hi, @thanhlamtrinh.

From what I understood, there are indeed situations where a given optimization algorithm may not be the best option. The important thing is whether those situations are representative of the error surface of your problem.

The example presented by Professor Ng is probably a good approximation for a common situation when training a neural network.

Here’s a very interesting post by @jonaslalin on this topic.

Good luck with course 2 🙂


Hi nramon,

Thank you very much for your helpful reply. Wishing you all the best,