Confusion about normalization

Now, I am kinda confused about the differences between min-max standardization, z-score normalization, and transformations such as the Box-Cox transformation. How do we know which method to choose for normalizing the data to prevent skewness?

Hi, @LimXiuXian96.

Normalization does not alter skewness. If you want to make your data more normally distributed, this may be helpful. If you’re worried about your dataset being imbalanced, take a look at this.

You may be interested in the Practical Data Science Specialization :slight_smile:

3 Likes

I forgot to include this link in case you were asking about the different normalization methods and their uses :nerd_face:

1 Like

From the lectures, I know that normalization makes training much faster at reaching a lower cost. But I am still confused about how skewness in the data might affect training our deep learning model. Does it have any effect on training?

If you’re referring to the distribution of the individual features, I’d say it doesn’t. But I’ll see if I can find any references :thinking:

1 Like

Ok noted, thanks a lot for the info provided.

1 Like

@LimXiuXian96

All your questions are very important and demonstrate your understanding of how the data affects the model performance.
I’d like to point out that there is a critical difference between standardization and normalization.
Standardization concerns the scale or unit of a feature, whereas normalization concerns the distribution of a feature. For example, min-max standardization fits the feature to be within [0, 1], but this does not mean it will be normally distributed; it may still be skewed. On the other hand, a z-score rescales the feature to zero mean and unit standard deviation, so most of the data points fall within 2\sigma of the mean, which means the values are most likely not within [0, 1] (and can be negative).
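
To make this concrete, here is a minimal sketch (assuming NumPy; the synthetic feature and variable names are just for illustration) that applies both rescalings to the same right-skewed feature. The resulting ranges are very different, but both are linear rescalings, so the shape of the distribution does not change:

```python
# Contrast min-max scaling and z-scoring on the same skewed feature.
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=1000)   # a right-skewed feature

# Min-max scaling: values land in [0, 1]
x_minmax = (x - x.min()) / (x.max() - x.min())

# Z-score: zero mean, unit standard deviation, values can be negative
x_z = (x - x.mean()) / x.std()

print(x_minmax.min(), x_minmax.max())            # ~0.0 and 1.0
print(x_z.mean().round(3), x_z.std().round(3))   # ~0.0 and 1.0
print(x_z.min())                                 # negative, i.e. not within [0, 1]
```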

Which method you choose depends on your application and what the model requires. For example, if you want the feature to be non-negative, min-max may be a better choice than a z-score, since z-scores can take negative values.

You are correct that normalizing/standardizing often leads to better performance and efficiency.
For a simple example, think of a linear model. Though it’s not imperative that each feature is normalized, you can easily see that the normalized model is much simpler to compute and to understand (the weights will also be on the same scale). This is just as important, if not more so, for larger, more complex models such as deep learning models.
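
Here is a toy sketch of that point (plain NumPy, made-up data; the specific numbers are only illustrative): batch gradient descent on a two-feature linear model, once with the raw features on wildly different scales and once after standardizing them. With raw features, the largest stable learning rate is dictated by the big feature, so progress along the small feature is painfully slow.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(0, 1, n)        # feature on a "small" scale
x2 = rng.normal(0, 1000, n)     # feature on a "large" scale
X = np.column_stack([x1, x2])
y = 3 * x1 + 0.002 * x2 + rng.normal(0, 0.1, n)

def gd(X, y, lr, steps=500):
    """Plain batch gradient descent on mean squared error; returns the final MSE."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return np.mean((X @ w - y) ** 2)

X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# Same budget of 500 steps; the raw features need a tiny learning rate
# (a larger one diverges), so they barely make progress.
print("standardized MSE:", gd(X_std, y, lr=0.1))   # close to the noise floor (~0.01)
print("raw-feature MSE:", gd(X, y, lr=1e-7))       # still far from the optimum
```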

Skewness, on the other hand, should not affect training efficacy. But here is what you should remember about skewness: for skewed data, the train-dev-test split needs to be done more carefully to make sure each split represents the full range of the data. You don’t want to end up with a situation where the test split contains no negative samples, for example.
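
For the classification case in that example, a stratified split is the usual fix. A small sketch, assuming scikit-learn and a synthetic, heavily imbalanced label (everything here is made up for illustration):

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = (rng.random(1000) < 0.05).astype(int)   # ~5% positive: heavily imbalanced

# 80/10/10 train/dev/test split, stratified on the label so every split
# contains roughly the same 5% positive rate.
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)
X_dev, X_test, y_dev, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.5, stratify=y_tmp, random_state=42)

print(y_train.mean(), y_dev.mean(), y_test.mean())   # all close to 0.05
```

For a skewed continuous target, one common approach is to bin the target and stratify on the bins in the same way.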

I hope that helps. Let me know if you have any questions.

1 Like

Thanks a lot for the info provided !

So based on my current understanding, to recap:

  1. A z-score transform / min-max standardization speeds up the training of a deep learning model because it rescales the features to a suitable range, which prevents the gradient with respect to one feature from being much larger than the gradients with respect to the other features and making optimization harder (taking more time and steps to reach a low cost).

  2. Skewness does not affect a deep learning model (a neural net, due to its universal approximation capability), but it will affect other machine learning / statistical models such as linear regression / logistic regression, right?

  3. A z-score transform does not alter skewness, i.e. it does not change the shape of a feature’s distribution (it only tells how far each value is from the feature’s mean), right?

Please correct me if anything is wrong, thanks in advance.

@LimXiuXian96
I think you are right on!
Normalization does not correct for skewness, and you are right on point about the universal approximation theorem.
And correcting skewness WOULD be helpful if you are doing a very simple linear regression, since the classical model assumes normally distributed errors, and heavily skewed features can make that assumption harder to satisfy.
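
To tie this back to the original Box-Cox question, here is a small sketch (NumPy/SciPy, synthetic data) showing the difference: a z-score leaves skewness untouched, while Box-Cox actually reshapes the distribution and reduces the skew.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.lognormal(mean=0.0, sigma=1.0, size=5000)   # strongly right-skewed, positive

x_z = (x - x.mean()) / x.std()     # z-score: new scale, same shape
x_bc, lam = stats.boxcox(x)        # Box-Cox: requires strictly positive data

print("raw skew:     ", stats.skew(x))      # large and positive
print("z-score skew: ", stats.skew(x_z))    # identical to the raw skew
print("box-cox skew: ", stats.skew(x_bc))   # close to 0
```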

2 Likes