Why is it a problem if one feature achieves its lowest value faster than the others? I suppose we’ll need more iterations for the other parameters to converge, which requires more resources, but are there other disadvantages that I am missing?
If all features of a model are within a certain range, say, 95-105, is feature scaling still recommended?
For the first question about one feature achieving its lowest value faster than others, could you elaborate on why we wouldn’t be able to use a large learning rate?
For the second question, are there any guidelines on what ranges are considered large?
Again for the second question, if values for all features lie within a small range, is feature scaling still necessary?
This post discusses the relationship between learning rate size, feature scales, and the regularity of the cost contours. You will also notice that the discussion is based on a lecture slide, so you might want to review those lectures again for more explanation.
The key to feature scaling is for all features to span a similar range, not the “best” range and not necessarily a small range. Usually people apply one of the first three methods in this Wikipedia section to all features for the job. Those three methods all result in a “small” range around zero, but being small is not the point; having a similar range across all scaled features is.
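For reference, here is a minimal NumPy sketch of those methods, assuming the first three in that Wikipedia section are min-max rescaling, mean normalization, and standardization (the toy feature matrix below is just made up for illustration):

```python
import numpy as np

# Hypothetical feature matrix: each column is one feature, on a different scale.
X = np.array([[100.0, 3.0, 2000.0],
              [ 98.0, 1.5, 5000.0],
              [104.0, 4.2, 1200.0]])

# Min-max rescaling: maps each feature into [0, 1].
X_minmax = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

# Mean normalization: centers each feature around 0, divided by its range.
X_mean_norm = (X - X.mean(axis=0)) / (X.max(axis=0) - X.min(axis=0))

# Standardization (z-score): zero mean, unit standard deviation per feature.
X_standard = (X - X.mean(axis=0)) / X.std(axis=0)

print(X_standard.round(2))
```

Whichever of the three you pick, the important outcome is the same: after scaling, all columns end up with comparable spreads, so no single feature dominates the gradient.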
If you scale all features to a similar range, you get a more regular cost contour (as exemplified in the lecture slide quoted in the linked post); if you exempt some features from scaling, you risk ending up with a less regular contour.
I recommend that you try to answer your own questions by doing some real experiments on different datasets. That will give you a more concrete idea, and you will see for yourself what you are trading off when you don’t scale each and every feature using the methods mentioned in that Wikipedia section.
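If you want something concrete to start from, below is a minimal sketch of such an experiment (the synthetic dataset, learning rate, and iteration count are my own arbitrary choices, not from the course). It shows the point about large learning rates: the same learning rate that converges on standardized features overflows on the raw ones, because the large-scale feature dominates the gradient.

```python
import numpy as np

def gradient_descent(X, y, lr, iters=2000):
    """Plain batch gradient descent on mean squared error."""
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        grad = X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

rng = np.random.default_rng(0)
n = 200
# Two features on very different scales (roughly 95-105 vs 0-1).
x1 = rng.uniform(95, 105, n)
x2 = rng.uniform(0, 1, n)
y = 3 * x1 + 5 * x2 + rng.normal(0, 0.1, n)

raw = np.column_stack([np.ones(n), x1, x2])              # bias + raw features
scaled = np.column_stack([np.ones(n),
                          (x1 - x1.mean()) / x1.std(),
                          (x2 - x2.mean()) / x2.std()])   # bias + standardized features

# The same moderate learning rate diverges on the raw features (expect
# overflow warnings and inf/nan) but converges after standardization.
for name, X in [("raw", raw), ("scaled", scaled)]:
    w = gradient_descent(X, y, lr=0.1)
    print(name, "final MSE:", np.mean((X @ w - y) ** 2))
```

You can then shrink the learning rate until the raw run stops diverging and count how many more iterations it needs, which makes the trade-off in your first question quite tangible.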