Confusion Regarding Mean Normalization

Bibek_Joshi · February 5, 2024, 1:55pm

Hi, can someone help me with this? I don’t understand: after subtracting the mean value, do we use the new matrix as y(i,j) for every W, B, X, or only for the users who haven’t rated any movie?

And if we do use it as the new y(i,j), won’t the predicted value also be negative? Multiplying by any W with a weight greater than 1 would make the negative value larger in magnitude, so adding back the corresponding mean could still give a negative rating instead of a positive one.

hackyon · February 5, 2024, 6:26pm

We use the mean-normalized y(i,j) for all users, not only for the users who haven’t rated any movie.

I don’t quite follow your logic here. The parameters (W and B) can be positive or negative, so when you multiply them out with X (which can also be positive or negative), the resulting Y can be positive or negative.

Bibek_Joshi · February 5, 2024, 6:34pm

Doesn’t the value of y tell us how highly a user might rate a movie? How can that rating be negative?
Can you please explain if I am missing something?
Thank you.

hackyon · February 5, 2024, 6:45pm

So there are 2 versions of y: 1) the y_target and 2) the y_prediction.

The y_target is the actual rating given by a user (real user data). Without mean normalization, y_target will not be negative. With mean normalization, since we are subtracting away the mean, y_target can be negative.

The y_prediction is the rating predicted by the model. It is not real user data, just a prediction from the model. It is possible for y_prediction to be negative since it is just a prediction, and the model can simply make a wrong prediction. With that said, a well-trained model is unlikely to output a negative number, but it is still a possibility.
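To make the y_target part concrete, here is a minimal sketch of mean normalization in plain Python. The toy ratings, the `None` marker for "not rated", and the function name `mean_normalize` are all made up for illustration, and it assumes each row is one movie and each column is one user. This is not the course lab's code.

```python
# Hypothetical toy data: rows are movies, columns are users.
# None marks "this user has not rated this movie".
ratings = [
    [5, 4, None, 0],
    [None, 1, 2, None],
]

def mean_normalize(ratings):
    """Subtract each movie's mean rating, computed over rated entries only."""
    means, normalized = [], []
    for row in ratings:
        rated = [r for r in row if r is not None]
        mu = sum(rated) / len(rated) if rated else 0.0
        means.append(mu)
        # Unrated entries stay None; rated entries become (rating - mean).
        normalized.append([None if r is None else r - mu for r in row])
    return normalized, means

y_norm, mu = mean_normalize(ratings)
print(mu)      # per-movie means
print(y_norm)  # normalized targets; some of them are negative
```

The 0-star rating in the first row ends up below that movie's mean, so its normalized target is negative even though the raw rating was not.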

TMosh · February 5, 2024, 6:49pm

Since we normalized the rating in the training set, you can “de-normalize” the predictions by just adding back the mean value.

It doesn’t change the quality of the predictions; it just shifts them back onto the original rating scale.
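As a rough sketch of that de-normalization step, assuming the prediction for one user and one movie is the usual linear form w·x + b: the function name `predict_rating` and all the numbers below are hypothetical.

```python
def predict_rating(w, x, b, mu):
    """Linear prediction in normalized space, shifted back by the movie mean mu."""
    normalized = sum(wi * xi for wi, xi in zip(w, x)) + b
    return normalized + mu

# Hypothetical learned parameters for one user and features for one movie.
w = [0.5, -1.0]
x = [1.0, 0.5]
b = 0.25
mu = 3.0  # mean rating of this movie in the training set

rating = predict_rating(w, x, b, mu)

# A user whose parameters are all zero gets the movie's mean as the prediction.
new_user_rating = predict_rating([0.0, 0.0], x, 0.0, mu)
print(rating, new_user_rating)
```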

Bibek_Joshi · February 5, 2024, 7:00pm

Let’s say the mean is 2 for some row.
Then a correct model will only produce normalized values between -2 and 3, since ratings are between 0 and 5. De-normalizing is done by adding the mean, so the final value will be between 0 and 5.
Is this intuition of mine correct?

hackyon · February 5, 2024, 7:07pm

Yes, that intuition is correct.

One thing to note is that, in practice, a model is not always “correct”. It can make mistakes, so it may produce normalized values outside of -2 to 3 (although this is unlikely for a good model). You would then want to write code to correct any predictions outside of the range (for example, if a bad model outputs -2.1, you can adjust it to -2 instead).
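A minimal sketch of that correction, assuming ratings must end up in 0 to 5 and the row mean is 2 as in the example above. The helper name `clip_rating` is made up. It clamps after adding the mean back, which is equivalent to clamping the normalized value to -2 to 3.

```python
def clip_rating(normalized_pred, mu, low=0.0, high=5.0):
    """De-normalize a prediction, then clamp it into the valid rating range."""
    return max(low, min(high, normalized_pred + mu))

mu = 2.0
too_low = clip_rating(-2.1, mu)   # would be -0.1 without clamping
too_high = clip_rating(3.4, mu)   # would be 5.4 without clamping
in_range = clip_rating(1.0, mu)   # already valid, left unchanged
print(too_low, too_high, in_range)
```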

TMosh · February 5, 2024, 7:09pm

One added note: this is an inherent flaw in using a linear output for what is essentially a classification exercise.

If you want ratings to only be integer values between 1 and 5, then those should be treated as classes, not real numbers.

You cannot constrain a linear output to not give you values that are outside the expectations from the training set.

Bibek_Joshi · February 5, 2024, 7:14pm

Won’t applying this constraint produce a model that is not able to make good predictions?

TMosh · February 5, 2024, 7:34pm

What constraint are you referring to?

I said that you cannot constrain a linear output to a specific range.

For example, if the model learns that it can minimize the cost on the training set by allowing for outputs that are outside the range of the expected values, that’s what you’re going to get. It’s a characteristic of using a linear model.
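Here is a small illustration of an unbounded linear output, using a one-feature least-squares fit on made-up data rather than the collaborative filtering model itself. Every training target is between 1 and 5, yet nothing stops the fitted line from predicting a value outside that range.

```python
# Hypothetical training data: every target is a valid rating between 1 and 5.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 2.0, 3.0, 4.0, 5.0]

# Closed-form least-squares fit of y = slope * x + intercept.
x_mean = sum(xs) / len(xs)
y_mean = sum(ys) / len(ys)
sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
sxx = sum((x - x_mean) ** 2 for x in xs)
slope = sxy / sxx
intercept = y_mean - slope * x_mean

def predict(x):
    """Linear output: no built-in bound on the returned value."""
    return slope * x + intercept

out_of_range = predict(6.0)
print(slope, intercept, out_of_range)
```

The fit is perfect on the training set, but the prediction for a feature value of 6.0 is above the maximum rating of 5.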

Bibek_Joshi · February 5, 2024, 7:37pm

Sorry, I misread it as “you can constrain” instead of “you cannot constrain”.

Bibek_Joshi · February 5, 2024, 7:40pm

Thank you @hackyon and @TMosh for your help.
