Hi there. I am learning Multiple Linear Regression and feature scaling.
I understand that it is necessary to normalize the input features when they are on different scales, because otherwise their weights are updated unevenly during gradient descent.
I am wondering whether we should also normalize the target value during training.
From my perspective, I think the target value also needs to be normalized. The linear model is f(x) = wx + b and the cost is J = (f(x) - y)^2. After x is rescaled to the range -1 to 1 while y stays very large, f(x) might be extremely small at the start, which makes the cost very large. In short, if we don't scale the target, the features and the target are not on the same scale, which looks quite strange to me.
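To make my concern concrete, here is a tiny numpy sketch (the dataset and the numbers 500 and 3000 are made up, just for illustration):

```python
import numpy as np

# Toy data: x already scaled to [-1, 1], y left on its original, much larger scale.
x = np.linspace(-1, 1, 100)
y = 500 * x + 3000            # made-up targets in the thousands

# Typical small initialization
w, b = 0.0, 0.0
f_x = w * x + b               # predictions start near zero, far below y

cost = np.mean((f_x - y) ** 2)
print(cost)                   # huge: roughly 9e6 for this data
```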
But in the coding example, I found that only the features are normalized. This makes me quite confused.
The bias term b lifts f(x) up to match the mean level of y. On the other hand, even though x only ranges between -1 and 1, w can stretch x to a much larger range to match the variance of y.
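For example, here is a minimal sketch on some made-up data (the exact relationship y = 500x + 3000 and the closed-form fit via np.polyfit are just for illustration; gradient descent would find the same solution):

```python
import numpy as np

x = np.linspace(-1, 1, 100)
y = 500 * x + 3000            # made-up data: y is on a much larger scale than x

# Closed-form least-squares fit: the model stretches and shifts by itself.
w, b = np.polyfit(x, y, deg=1)
print(w, b)                   # ~500.0 and ~3000.0: w stretches x, b lifts f(x) to the mean of y
```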
Thanks, @rmwkwok, now I understand that the model can work without normalizing the targets. However, do you think normalizing the target can somewhat accelerate training?
Usually we initialize w to be roughly between -2 and 2, and b to be 0. If the optimal w and b are in these ranges, of course you need fewer steps to converge compared to the case where the optimal parameters are far away, because then they need to stretch a lot and shift a lot. So, if your y is normalized, the optimal parameters will be closer to those ranges, and therefore fewer steps are needed!
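A quick sketch of this point, reusing the made-up data from above (the z-score normalization of y is just one common choice):

```python
import numpy as np

x = np.linspace(-1, 1, 100)
y = 500 * x + 3000

# Optimal parameters for the raw target: far from a [-2, 2] / b = 0 initialization.
print(np.polyfit(x, y, 1))            # ~[500., 3000.]

# Optimal parameters after z-score normalizing y: right inside the initialization range.
y_norm = (y - y.mean()) / y.std()
print(np.polyfit(x, y_norm, 1))       # ~[1.71, 0.]
```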
That said, we usually only talk about normalizing features, because unnormalized features can really give you a hard time choosing a learning rate that converges at all. Once the features are normalized, the learning rate is pretty easy to choose, and even if you don't normalize y, gradient descent still converges; it might just take an extra few, or a few tens of, iterations compared to a normalized y. That kind of iteration count is nothing to a computer.
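To illustrate, here is a rough sketch that counts gradient descent steps on the same made-up data (the learning rate, tolerance, and stopping rule are arbitrary choices, so the exact step counts will vary):

```python
import numpy as np

def gd_steps(x, y, lr=0.1, tol=1e-3, max_iters=10000):
    """Run gradient descent on J = mean((w*x + b - y)^2); return the number of
    steps taken until the gradient falls below tol."""
    w, b = 0.0, 0.0
    for step in range(1, max_iters + 1):
        err = w * x + b - y
        dw, db = 2 * np.mean(err * x), 2 * np.mean(err)
        w, b = w - lr * dw, b - lr * db
        if max(abs(dw), abs(db)) < tol:
            return step
    return max_iters

x = np.linspace(-1, 1, 100)
y = 500 * x + 3000
y_norm = (y - y.mean()) / y.std()

print(gd_steps(x, y))        # raw target: still converges, roughly 180 steps here
print(gd_steps(x, y_norm))   # normalized target: roughly 100 steps
```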