Data input normalization

Hi!
In the course it is emphasized that the features of the input X should be normalized before running a NN model. The explanations given are clear to me. But I’m wondering, what about the response variable Y, for example in non-classification problems such as regression?
My intuition is that the target Y should be left unchanged. But one example I have in mind is when Y can potentially be large compared to the normalized features in X - could this be a problem for a NN?

Thanks!

Hi @Matim,

In most cases, you would do just fine without normalizing “y”. However, a target on a large scale produces large error values, which can cause exploding weights in the network during training.
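A minimal sketch of the workaround (assuming scikit-learn; the data, model, and scaler choices here are purely illustrative): standardize the target before fitting and invert the transform on the predictions.

```python
# Sketch: standardize a large-scale target before fitting, then map
# predictions back to the original scale. All names are illustrative.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 1e6 * (X @ np.array([1.0, 2.0, 3.0])) + rng.normal(size=200)  # huge scale

y_scaler = StandardScaler()
y_scaled = y_scaler.fit_transform(y.reshape(-1, 1)).ravel()

model = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
model.fit(X, y_scaled)  # the network only ever sees a unit-scale target

# Predictions come back in scaled units; invert to the original scale.
y_pred = y_scaler.inverse_transform(model.predict(X).reshape(-1, 1)).ravel()
```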

Best,
Bahadir

Hi @Matim,

If you want to study an example where output normalization achieves better results, I recommend checking out Effect of transforming the targets in regression model — scikit-learn 0.24.2 documentation.
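The core pattern in that example (sketched here from memory, so treat the exact parameters as illustrative) is scikit-learn’s TransformedTargetRegressor, which transforms y before fitting and maps predictions back automatically:

```python
# Sketch of the pattern from the linked example: the wrapper transforms y
# before fitting and inverts the transform at predict time.
from sklearn.compose import TransformedTargetRegressor
from sklearn.linear_model import Ridge
from sklearn.preprocessing import QuantileTransformer

model = TransformedTargetRegressor(
    regressor=Ridge(),
    transformer=QuantileTransformer(
        n_quantiles=100, output_distribution="normal"
    ),
)
# model.fit(X, y) fits on the transformed target; model.predict(X)
# returns values already mapped back to the original scale of y.
```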

Also fun to read :nerd_face:

I think that is a very good point. I do believe a target variable on a totally different scale makes the network harder to train.
To add my two cents: in practice, I simply end up scaling the target. Note that this is a bit different from normalizing it. Rescaling means the values are mapped into a common range, while normalizing can change the distribution to some degree, which is not always desirable. You usually want the target variable to retain the distribution of the original data.
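A quick sketch of that distinction (scikit-learn and a synthetic skewed target assumed): an affine rescaling such as min-max leaves the shape of the distribution untouched, while a quantile transform deliberately reshapes it.

```python
# Min-max rescaling preserves the distribution's shape (skewness is
# invariant under affine maps); a quantile transform reshapes it.
import numpy as np
from scipy.stats import skew
from sklearn.preprocessing import MinMaxScaler, QuantileTransformer

rng = np.random.default_rng(0)
y = rng.lognormal(sigma=1.0, size=(1000, 1))  # strongly right-skewed target

y_rescaled = MinMaxScaler().fit_transform(y)
y_reshaped = QuantileTransformer(
    n_quantiles=100, output_distribution="normal"
).fit_transform(y)

# Original and rescaled skewness are identical; the quantile-transformed
# target is near 0, i.e. roughly Gaussian.
print(skew(y.ravel()), skew(y_rescaled.ravel()), skew(y_reshaped.ravel()))
```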

Hi Suki,

Thanks for the additional input. I didn’t go into the particulars.

Scaling and normalization each have their advantages and disadvantages. I believe min-max scaling, which is also a normalization technique, lets you revert back to the original values. But as you warned, there are methods, such as L2 normalization, that discard the information needed to invert the transform.
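To illustrate (a sketch with scikit-learn; the tiny array is just for demonstration): a fitted MinMaxScaler stores the per-feature minimum and range, so inverse_transform recovers the data exactly, whereas per-sample L2 normalization discards the row norms.

```python
# Min-max scaling is invertible because the fitted min/range are stored;
# L2 normalization divides each row by its norm and throws that norm away.
import numpy as np
from sklearn.preprocessing import MinMaxScaler, normalize

y = np.array([[1.0, 2.0], [3.0, 4.0]])

scaler = MinMaxScaler()
y_mm = scaler.fit_transform(y)
print(np.allclose(scaler.inverse_transform(y_mm), y))  # True: fully recoverable

y_l2 = normalize(y, norm="l2")
# No inverse exists here: e.g. [1, 2] and [2, 4] map to the same unit
# vector, so the original magnitudes cannot be recovered from y_l2 alone.
```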

Best,
Bahadir