The Effect of Feature Rescaling on Convergence

ahmedelmasry1985 · July 27, 2022, 3:46pm

Good day,
I decided to write the entire example to practice the Linear Regression better than the optional labs.

I noticed that sometimes the w & b reach Nan or Inf after 100 iterations or less with alpha 0.1e-3, and it was very hard to find a better alpha and iterations for the algorithm.

On the other hand, when I normalized the X (though my data has just one feature with range 200:600), it took 1k iterations with alpha 0.1 to converge.

So my question here, Does the rescaling really affect the convergence accuracy and speed? How and Why? I don’t get it.

TMosh · July 27, 2022, 4:17pm

Yes, the learning rate is closely related to the magnitudes of the features. This is because the gradients are computed by the product of the errors and X(i). Then you multiply the gradients by the learning rate to get the change in the weight values.

When you normalize the features, this allows you to use a larger learning rate without any individual feature causing the solution to diverge.

Topic		Replies	Views
Optional Lab: Feature Engineering and Polynomial Regression (Feature scaling impact on Convergence) Supervised ML: Regression and Classification week-2	2	532	September 2, 2022
The relation between scaling and learning rate Supervised ML: Regression and Classification week-2	3	536	March 27, 2023
Is my understanding of Feature Scaling correct? Supervised ML: Regression and Classification week-2	3	528	August 12, 2022
Why feature scaling can make the learning rate large? Supervised ML: Regression and Classification	8	521	July 15, 2022
Error in Optional lab: feature scaling and learning rate Supervised ML: Regression and Classification week-2	1	365	September 5, 2023

The Effect of Feature Rescaling on Convergence

Related topics