Clarity on learning rate slides

Hello! I need some clarity on the following.

This slide talks about the learning rate staying constant (first graph):

This slide talks about learning rate reducing as it gets closer to the minimum (see graph):

The equations are the same, right? How does it stay constant vs. reduce over time?

Here in this figure the slope, d(J(w))/dw, keeps decreasing. So although alpha is constant, the size of each gradient descent step changes: it slows down near the minimum because the slope tends to 0 there.

How does it stay constant in the top slide, top graph, as it gets close to the minimum?

The learning rate (alpha) is a constant.
Since the derivative (the slope) approaches zero at the bottom of the curve, each step toward the minimum gets smaller.
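
To make this concrete, here is a minimal sketch in Python. It assumes a made-up cost J(w) = w**2 (not the one from the slides), plus an arbitrary starting point and alpha, just to show that the step shrinks while alpha never changes:

```python
# Hypothetical example cost J(w) = w**2, so the derivative is dJ/dw = 2*w.
# This function is an assumption for illustration, not taken from the slides.
def dJ_dw(w):
    return 2.0 * w

alpha = 0.1       # learning rate: stays the same in every iteration
w = 4.0           # arbitrary starting point
step_sizes = []   # size of each update, |alpha * dJ/dw|

for i in range(10):
    step = alpha * dJ_dw(w)   # the update is alpha times the current slope
    w = w - step              # gradient descent update: w := w - alpha * dJ/dw
    step_sizes.append(abs(step))
    print(f"iter {i}: alpha={alpha}, step={step:.4f}, w={w:.4f}")
```

The printed alpha is identical on every line, but the step column keeps shrinking because the slope 2*w gets smaller as w approaches the minimum at 0.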

The learning rate alpha is constant, but the derivative is changing, so the gradient descent step is changing in the first slide too. That slide only talks about alpha, which is constant in both slides.

Hello @Buddhima!

Maybe reading this article of mine will clear up the concepts. Give it a read.


Thanks everyone! Thanks for the link Saif.

Tell me if I got this right: provided the learning rate isn’t too large, J(w) will reduce with each iteration.

The learning rate stays constant through each iteration.

Is there a scenario where J(w) reduces in equal amounts? Like in slide 1 top graph? Or does this not matter? :slight_smile:

Yes, you are correct.

Don’t read too much into the details of the first graph. It’s just a sketch.
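
For anyone who wants to check this numerically, here is a rough sketch. It again assumes a made-up cost J(w) = w**2, and the helper `run` and both alpha values are hypothetical choices for illustration only:

```python
# Hypothetical cost and derivative, chosen only for illustration.
def J(w):
    return w ** 2

def dJ_dw(w):
    return 2.0 * w

def run(alpha, w=4.0, iters=6):
    """Return the cost J(w) at the start and after each gradient descent step."""
    costs = [J(w)]
    for _ in range(iters):
        w = w - alpha * dJ_dw(w)   # alpha is fixed for the whole run
        costs.append(J(w))
    return costs

small = run(alpha=0.1)   # a learning rate that is small enough for this curve
large = run(alpha=1.1)   # a learning rate that is too large for this curve

# How much J(w) dropped in each iteration of the small-alpha run.
drops = [before - after for before, after in zip(small, small[1:])]

print("costs, alpha=0.1:", [round(c, 4) for c in small])
print("drops, alpha=0.1:", [round(d, 4) for d in drops])
print("costs, alpha=1.1:", [round(c, 4) for c in large])
```

With the small alpha, J(w) goes down on every iteration, but each drop is smaller than the one before, so the reductions are not equal on this curve. With the too-large alpha, J(w) goes up on every iteration instead.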