Hi can some one help me with this. I didn’t understood that after subtracting the mean value do we use the new matrix as the y(i,j) for each W,B,X or just for the users that don’t have any data filled or rated any movie.

And if we do use it as new y(i,j) won’t the value also be in -ve as multiplying with value of any W having weight greater than 1 will increase the -ve value and just adding the corresponding mean render a -ve value instead of positive one.

I don’t quite follow your logic here. The paramaters (W and B) can be positive or negative, so when you multiply it out with X (which can also be positive or negative), the resulting Y can be positive or negative.

Doesn’t the value of y give us the output how much a user might rate a movie. How can that rating be negative.
Can you please explain if I am missing something.
Thank you

So there are 2 versions of y: 1) the y_target and 2) the y_prediction.

The y_target is the actual rating given by a user (real user data). Without mean normalization, y_target will not be negative. With mean normalization, since we are subtracting away the mean, y_target can be negative.

The y_prediction is the rating predicted by the model. It is not real user data, just a prediction from the model. It is possible for y_prediction to be negative since it is just a prediction, and the model can simply make a wrong prediction. With that said, a well-trained model is unlikely to output a negative number, but it is still a possibility.

Let’s say the mean is 2 for some row.
Then a correct model will only produce a normalized value between -2 to 3 as the rating is between 0 to 5, and de-normalizing is done by adding the mean hence the value will be between 0 to 5.
Is this intuition of mine correct.

One thing to note is that, in practice, a model may or may not be always “correct”. It can make mistakes, and so it is possible for it to produce normalized values outside of -2 to 3 (although this is unlikely to happen for a good model). In practice, you would want to write code to correct any predictions outside of the range (for example, if a bad model outputs -2.1, you can write code to adjust it to -2 instead).

I said that you cannot constrain a linear output to a specific range.

For example, if the model learns that it can minimize the cost on the training set by allowing for outputs that are outside the range of the expected values, that’s what you’re going to get. It’s a characteristic of using a linear model.