How does scaling make gradient descent faster?

Hi @Sepehr_Razavi,

If you don’t normalize and the features have very different scales, you can still optimize the model, but only with a sufficiently small learning rate, and that’s exactly why it takes more steps: smaller learning rate, more steps.
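
To make this concrete, here is a minimal sketch (my own, not from the course material) that fits a two-feature linear regression with batch gradient descent. The feature ranges, learning rates, and the `steps_to_converge` helper are all made up for illustration; the point is only that the unscaled problem forces a tiny learning rate while the scaled one converges quickly with a larger one.

```python
import numpy as np

# Two features with very different scales (illustrative values).
rng = np.random.default_rng(0)
n = 200
x1 = rng.uniform(0, 1, n)        # small-scale feature
x2 = rng.uniform(0, 1000, n)     # large-scale feature
X = np.column_stack([x1, x2])
y = 3 * x1 + 0.05 * x2 + rng.normal(0, 0.1, n)

def steps_to_converge(X, y, lr, tol=1e-6, max_steps=100_000):
    """Batch gradient descent on MSE; returns steps taken, or None if it diverges."""
    w = np.zeros(X.shape[1])
    for step in range(max_steps):
        grad = 2 / len(y) * X.T @ (X @ w - y)
        if not np.all(np.isfinite(grad)):
            return None                      # update blew up: lr too large
        w_new = w - lr * grad
        if np.linalg.norm(w_new - w) < tol:  # updates have become negligible
            return step
        w = w_new
    return max_steps                         # hit the step cap

# Unscaled: a moderate lr diverges; only a tiny lr is stable, so it needs many steps.
print(steps_to_converge(X, y, lr=1e-1))   # None: diverges
print(steps_to_converge(X, y, lr=1e-7))   # stable, but very slow (may hit the cap)

# Scaled: both features have comparable magnitudes, so one moderate lr works well.
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
print(steps_to_converge(Xs, y, lr=1e-1))  # converges in far fewer steps
```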

As for why we need a smaller learning rate, I think it’s easiest to see visually. This discussion used a slide to explain it…

Raymond