Does RMSprop slow down learning when we should take larger steps in a direction?

Sara · June 5, 2021, 7:36am

In the course we learn how RMSprop reduces oscillation by scaling down the steps in directions with large oscillations. This is done by scaling down learning in the directions with large gradients. But what if the gradients are large because the cost really does change quickly along that dimension? Wouldn’t RMSprop hurt learning then?

For example, in the graph below, no matter where we start, the squared gradient in the b dimension is always larger than that in the w dimension, so learning in the b direction always gets scaled down more.
If we start at A, RMSprop reduces the gradients more in dimension b, allowing us to use a larger learning rate to speed up learning in dimension w.
If we start at B, RMSprop again reduces the gradients more in dimension b. But in this scenario, that seems to slow down learning.

AnkitSaini · June 6, 2021, 5:34pm

@Sara
RMSprop automatically adjusts the effective learning rate of each parameter based on an exponential moving average of its squared gradients. The effective learning rate is higher on a flat surface (small gradients, no sudden changes).
The effective learning rate is reduced where gradients are large or change suddenly. This does not hurt learning; it helps reduce oscillations during training.
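To make the idea of an effective learning rate concrete, here is a minimal NumPy sketch of a textbook RMSprop update (without bias correction). The function name `rmsprop_step`, the hyperparameter values, and the constant example gradients are assumptions chosen for illustration; they are not taken from the course or from the graph in the question.

```python
import numpy as np

def rmsprop_step(params, grads, sq_avg, lr=0.01, beta=0.9, eps=1e-8):
    """One RMSprop update (hypothetical helper, no bias correction)."""
    # Exponential moving average of the squared gradients, per parameter.
    sq_avg = beta * sq_avg + (1.0 - beta) * grads ** 2
    # Per-parameter effective learning rate: small where gradients are large.
    effective_lr = lr / (np.sqrt(sq_avg) + eps)
    new_params = params - effective_lr * grads
    return new_params, sq_avg, effective_lr

# Assumed toy setup: index 0 plays the role of w (small gradient),
# index 1 plays the role of b (large gradient). Gradients are held constant.
params = np.zeros(2)
sq_avg = np.zeros(2)
grads = np.array([0.01, 10.0])

for _ in range(200):
    params, sq_avg, effective_lr = rmsprop_step(params, grads, sq_avg)

# Actual distance moved per step in each dimension.
step = effective_lr * grads
print("effective learning rate:", effective_lr)
print("step size per dimension:", step)
```

In this sketch the dimension with the larger gradient ends up with the smaller effective learning rate, and the distance moved per step comes out close to `lr` in both dimensions even though the raw gradients differ by a factor of 1000.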

Hope this clarifies your doubt.

realnoob · August 23, 2022, 6:34am

What is the ‘effective learning rate’? Can you explain in more detail how the algorithm will behave if you start from point B?
