Normalization before a new prediction?

I am currently working on a multivariate linear regression.
I’m a bit confused about normalization (mean = 0, standard deviation = 1). If we normalize our training set, shouldn’t we also normalize our cross-validation set, as well as any new values of X that the model has not seen?

I have the same question for image classification with a CNN. If we pre-process our training images, shouldn’t we apply the same pre-processing to a new image that the model has not seen?




Yes, every transformation you apply to the data during training should also be applied to new data.

Hi, @Pierre_BEJIAN !

Just one more thing to add to the post. You should normalize your test data with the mean and standard deviation computed on your training data. In a real-world scenario you don’t know how your input data is distributed, so you cannot rescale it with any values other than the ones from your training set.
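
To make that concrete, here is a minimal sketch in plain Python. The helper names `fit_scaler` and `transform` and the toy data are made up for illustration; the only point is that the mean and standard deviation are computed once on the training set and then reused unchanged for any new input.

```python
from statistics import mean, pstdev


def fit_scaler(rows):
    """Compute per-feature mean and std dev from the training rows only."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # A constant feature has std dev 0; fall back to 1.0 to avoid dividing by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return means, stds


def transform(rows, means, stds):
    """Standardize rows using statistics that were fitted on the training set."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


# Hypothetical toy data: two features per example.
X_train = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
X_new = [[4.0, 40.0]]

means, stds = fit_scaler(X_train)                 # statistics come from training data only
X_train_scaled = transform(X_train, means, stds)
X_new_scaled = transform(X_new, means, stds)      # same statistics, not recomputed
```

If you use scikit-learn, the same pattern should be `StandardScaler().fit(X_train)` followed by `.transform(...)` on the validation, test, and new data, rather than calling `fit` again on each of them.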

You are totally right! I had guessed it but it’s good to have confirmation.
Thanks a lot