Optional Lab: Feature Engineering and Polynomial Regression (Feature Scaling Impact on Convergence)

Hi Mentors, Fellow learners,

I have a question about the feature scaling example in this lab. The lab mentions that feature scaling allows the algorithm to converge faster.

However, in the example provided, the number of iterations after feature scaling is 100,000 compared with 10,000 without feature scaling, and the learning rate has also been increased.

To isolate the effect of feature scaling, I kept both the number of iterations and the learning rate the same as the values used before feature scaling.

As shown below, with those same settings the run with feature scaling isn’t even close to converging, whereas the run without feature scaling did converge.

Can someone please explain why this is happening and how feature scaling helps convergence?

After Feature Scaling: [screenshot of the gradient descent output omitted; the cost is still far from the minimum]
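For reference, here is a minimal sketch of the kind of comparison I ran, written in plain NumPy rather than with the lab's helper functions. The toy data, learning rate, and iteration count below are placeholders I picked for illustration, not the lab's actual values:

```python
import numpy as np

# Toy data with two features on very different scales
# (placeholder values, not the lab's data).
np.random.seed(0)
m = 100
X = np.c_[np.random.uniform(500, 2000, m),   # e.g. a large-scale feature
          np.random.uniform(1, 5, m)]        # e.g. a small-scale feature
y = 0.2 * X[:, 0] + 10.0 * X[:, 1] + np.random.randn(m)

def final_cost(X, y, alpha, num_iters):
    """Cost after running plain batch gradient descent for linear regression."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(num_iters):
        err = X @ w + b - y
        w -= alpha * (X.T @ err) / len(y)
        b -= alpha * err.mean()
    return ((X @ w + b - y) ** 2).mean() / 2

# Same (small) learning rate and iteration count for both runs.
alpha, iters = 1e-7, 10_000
X_norm = (X - X.mean(axis=0)) / X.std(axis=0)  # z-score normalization

print("raw features,    cost:", final_cost(X,      y, alpha, iters))
print("scaled features, cost:", final_cost(X_norm, y, alpha, iters))
# The scaled run has barely moved from its starting point, so its cost is
# still near its initial value, while the raw run has made substantial
# progress. This matches what I observed in the lab.
```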

The benefit of feature normalization is not that it speeds up convergence under otherwise identical settings.

The benefit is that normalization allows you to use a larger learning rate without the risk of the solution diverging, and a larger learning rate means you need fewer iterations.
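A quick way to see this, using the same plain-NumPy setup as above (toy data and hyperparameters are illustrative, not the lab's):

```python
import numpy as np

np.random.seed(1)
m = 100
X = np.c_[np.random.uniform(500, 2000, m),   # feature on a large scale
          np.random.uniform(1, 5, m)]        # feature on a small scale
y = 0.2 * X[:, 0] + 10.0 * X[:, 1] + np.random.randn(m)

def final_cost(X, y, alpha, num_iters):
    """Cost after running plain batch gradient descent for linear regression."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(num_iters):
        err = X @ w + b - y
        w -= alpha * (X.T @ err) / len(y)
        b -= alpha * err.mean()
    return ((X @ w + b - y) ** 2).mean() / 2

X_norm = (X - X.mean(axis=0)) / X.std(axis=0)

# A learning rate this large diverges on the raw features (the cost blows
# up toward inf/nan), but converges quickly on the z-score normalized
# features, so far fewer iterations are needed.
print("raw,    alpha=1e-1:", final_cost(X,      y, alpha=1e-1, num_iters=1000))
print("scaled, alpha=1e-1:", final_cost(X_norm, y, alpha=1e-1, num_iters=1000))
```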


Our primary goal is to converge to the optimal cost as quickly as possible.

In principle, there is no restriction on the value we can choose for the learning rate. In practice, however, divergence becomes the limiting factor when we try very high learning rates.

As @TMosh has explained, normalizing the features gives us more leeway to set higher learning rates without the risk of divergence.

So it is not the normalization of features in itself that speeds up convergence; rather, normalization relaxes the upper limit on the learning rate. We take advantage of this by setting a higher learning rate, which in turn gives faster convergence.
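To make that "upper limit" a bit more concrete, here is the standard bound for linear regression with the squared-error cost (ignoring the bias term for simplicity). For

$$J(\mathbf{w}) = \frac{1}{2m}\lVert X\mathbf{w} - \mathbf{y}\rVert^2,$$

batch gradient descent converges only if

$$\alpha < \frac{2}{\lambda_{\max}},$$

where $\lambda_{\max}$ is the largest eigenvalue of $\frac{1}{m}X^T X$. With unscaled features, $\lambda_{\max}$ is roughly the square of the largest feature magnitude (on the order of $10^6$ for a feature around $10^3$), which forces a tiny $\alpha$. After z-score normalization every feature has unit variance, so $\lambda_{\max}$ is on the order of 1 and a much larger $\alpha$ can be used safely.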
