In video about mean normalization it was shown to normalize the ratings but how is it actually good
for example if we normalize everything people that rated a movie 5 got this rating set to 2.5 and when we update new user we also made him have 2.5 for this move because that was mean of values but now he has the equality of 5 in comparison to users that actually rated it 5
Hello @Michal_Bober,
The mean is just believed to be a better guess than zero. It is also simple and intuitive. It simply assumes a new user’s preference to be on an averaged level. It is not personalised and it does not guarantee performance.
So, with such a simple trick of mean normalisation, we are able to assign an averaged rating to every movie for a new user, and this is considered an improvement.
Cheers,
Raymond
Hello,
But why don’t we use our regular algorithm without mean normalization and than if it is a new user we can directly predict the user’s rate for the movie as mean of that row. Does mean normalization also make improvements on users that has rated movies? If yes can you please explain how?
In your case, we will have and maintain two procedures in our production system, and whether this is preferred or not is to be considered.
- for old user, it goes through the model
- for new user, it loads the mean, with the mean being pre-computed already
I don’t think the normalization will make a lot of difference in the performance, but it is generally a good practice to normalize to avoid large numbers in the process of gradient descent. We might have only 1 - 5 here, but can we have 1 - 100 in another system?
Speaking of performance, you see, what mean normalization does is to have an additional movie-based term in the model, only the term isn’t by training. Performance boost might be seen if we relax the term to be trainable.
Cheers,
Raymond