My question is when I should use z-score normalisation and when I should use mean normalisation.
Please elaborate on why we need both and which normalisation we should use for practical purposes.
Regards
This is a duplicate of other questions on the same topic.
The choice between these methods often comes down to the structure of your data and the specific algorithm you're using, as in your earlier question on the same topic. Z-score normalization subtracts the mean and divides by the standard deviation, so it is useful when you want every feature on a zero-mean, unit-variance scale. Mean normalization subtracts the mean and divides by the range (max minus min); it is simpler and often effective when you just want to rescale into a fixed range, roughly -1 to 1. You may want to:
Use z-score normalization when your data follows a Gaussian distribution, or when algorithms (such as k-means clustering or logistic regression) assume the features are normally distributed (e.g., many machine learning algorithms such as SVMs or neural networks).
Use mean normalization when the data distribution is unknown or arbitrary (non-Gaussian distributed data) and you just need to scale the features in a way that balances their impact on model training (e.g., recommender systems). It's often sufficient for algorithms that don't make strong assumptions about the data distribution, and you only care about centering the data and making sure it's in a certain range (like -1 to 1).
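To make the two definitions concrete, here is a minimal sketch in plain Python. The function names and the sample values are made up for illustration; they are not from the course material. It assumes the usual definitions: z-score divides by the (population) standard deviation, mean normalization divides by max minus min.

```python
from statistics import fmean, pstdev

def z_score_normalize(values):
    """(x - mean) / std: result has mean 0 and standard deviation 1."""
    mu = fmean(values)
    sigma = pstdev(values)  # population standard deviation (assumption)
    return [(v - mu) / sigma for v in values]

def mean_normalize(values):
    """(x - mean) / (max - min): result has mean 0 and lies within [-1, 1]."""
    mu = fmean(values)
    value_range = max(values) - min(values)
    return [(v - mu) / value_range for v in values]

# Hypothetical feature column (e.g. house sizes), for illustration only.
sizes = [2104.0, 1416.0, 852.0, 1534.0]

z_scaled = z_score_normalize(sizes)
m_scaled = mean_normalize(sizes)

print("z-score:", [round(v, 3) for v in z_scaled])
print("mean-normalized:", [round(v, 3) for v in m_scaled])
```

Both versions centre the feature at zero; they differ only in the divisor, so the z-score output is not bounded to a fixed range, while the mean-normalized output always spans a width of exactly 1.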
Sir,
Do you mean Logistic regression expects us to give data that follows Gaussian distribution as the input?
Also SVMs ??
I am surprised Dr Andrew never mentioned this in the course.
There was no mention of using z-score specifically for Logistic regression
You are right to be surprised - logistic regression and SVMs do not require the input data to follow a Gaussian (normal) distribution. My earlier explanation may have confused you about when z-score normalization is appropriate. Z-score normalization isn't required for logistic regression or SVMs, but feature scaling in general is helpful to ensure that the optimization process works efficiently. It's not about the distribution of the data, it's about ensuring comparable feature scales to avoid bias in gradient-based optimization.
Dr. Andrew Ng doesn't specifically mention in the course that z-score normalization is required for logistic regression or SVM, because these algorithms don't depend on the data being normally distributed. The course emphasizes feature scaling in general because it helps optimization algorithms, such as gradient descent, converge faster. Whether you use mean normalization or z-score normalization depends more on your dataset and the specific problem you're solving.
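As a rough illustration of the convergence point, the sketch below runs batch gradient descent for a small linear regression twice: once on raw features and once on z-scored features. Everything here is an assumption made for the example (the toy data, the learning rates, the tolerance and the helper names); linear regression is used only because it is the simplest gradient-based case. The raw run gets a much smaller learning rate because a larger one would diverge on features of this magnitude.

```python
from statistics import fmean, pstdev

def z_score_columns(X):
    """Z-score each column of a list-of-rows matrix."""
    columns = list(zip(*X))
    mus = [fmean(c) for c in columns]
    sigmas = [pstdev(c) for c in columns]
    return [[(v - mu) / s for v, mu, s in zip(row, mus, sigmas)] for row in X]

def gradient_descent(X, y, lr, max_iters=10_000, tol=1e-6):
    """Batch gradient descent for linear regression with an intercept.

    Returns (iterations_used, converged), where converged means every
    gradient component fell below tol within max_iters iterations.
    """
    m, n = len(X), len(X[0])
    w, b = [0.0] * n, 0.0
    for it in range(1, max_iters + 1):
        errors = [sum(wj * xj for wj, xj in zip(w, row)) + b - t
                  for row, t in zip(X, y)]
        grad_w = [sum(e * row[j] for e, row in zip(errors, X)) / m
                  for j in range(n)]
        grad_b = sum(errors) / m
        if max(abs(g) for g in grad_w + [grad_b]) < tol:
            return it, True
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return max_iters, False

# Hypothetical data: [size, bedrooms] -> price. Illustration only.
X_raw = [[2104.0, 3.0], [1416.0, 2.0], [852.0, 1.0], [1534.0, 3.0]]
y = [400.0, 232.0, 178.0, 315.0]

# Raw features: the large "size" column forces a tiny learning rate.
raw_iters, raw_converged = gradient_descent(X_raw, y, lr=1e-7)

# Z-scored features: comparable scales allow a much larger learning rate.
scaled_iters, scaled_converged = gradient_descent(z_score_columns(X_raw), y, lr=0.1)

print("raw:", raw_iters, "iterations, converged =", raw_converged)
print("scaled:", scaled_iters, "iterations, converged =", scaled_converged)
```

With these settings the scaled run should reach the tolerance well inside the iteration budget while the raw run should not. Swapping in mean normalization instead of z-scoring should give a similar effect, which is the sense in which the choice between the two matters less than scaling at all.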
Hope this helps!