Collaborative Filtering doesn't give each user the same weight

In the collaborative filtering section, the overall cost function combines the cost function for learning w(1), b(1), …, w(n_u), b(n_u) with the cost function for learning x(1), …, x(n_m).

In the cost function for learning w(1), b(1), …, w(n_u), b(n_u), the normalizing constant on the outside is 1/2. For each individual user's loss, the natural normalizer would be 1/(2 · m(j)), where m(j) is the number of movies rated by user j. Dropping the constant m(j) makes no difference there, because each user's w(j), b(j) is minimized independently, and scaling a cost by a positive constant doesn't change its minimizer.
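As a quick numerical check of that last point, here is a minimal sketch with made-up data for a single user: minimizing the per-user squared-error cost scaled by any positive constant gives the same w(j), b(j).

```python
import numpy as np

# Hypothetical user j: feature vectors x^(i) of 3 rated movies, ratings y^(i,j)
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([4.0, 2.0, 5.0])
A = np.hstack([X, np.ones((3, 1))])  # columns correspond to w_1, w_2, b

def minimize(c):
    # argmin over (w, b) of c * ||A @ [w; b] - y||^2.
    # Scaling the squared error by c scales the rows by sqrt(c);
    # the least-squares solution is unchanged.
    return np.linalg.lstsq(np.sqrt(c) * A, np.sqrt(c) * y, rcond=None)[0]

sol_half = minimize(0.5)          # normalizer 1/2
sol_per_user = minimize(0.5 / 3)  # normalizer 1/(2 * m(j)) with m(j) = 3
print(np.allclose(sol_half, sol_per_user))  # True: the constant doesn't move the argmin
```

So as long as each user is fitted on their own, the choice of normalizer is cosmetic.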

However, once the two cost functions are combined, the result seems to give more consideration to users who rate many movies than to users who rate only one or two, because each user's w(j), b(j) is no longer estimated independently but interacts with x(1) through x(n_m).

Collaborative filtering seems to give each rating the same weight, but not each user the same weight: a user who rates a lot of movies carries a lot of weight in the system.
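A toy illustration of this point (the rating matrix below is hypothetical): in the combined cost, user j contributes one squared-error term per rated movie, so a heavy rater contributes many more terms than a light rater.

```python
import numpy as np

# Toy rating matrix: rows = movies, columns = users; np.nan = not rated
Y = np.array([
    [5.0,    np.nan],
    [4.0,    np.nan],
    [3.0,    np.nan],
    [np.nan, 2.0   ],
])
R = ~np.isnan(Y)              # r(i, j) = 1 where a rating exists

# Each user j contributes one loss term per movie with r(i, j) = 1
terms_per_user = R.sum(axis=0)
print(terms_per_user)         # [3 1]: user 0 carries 3x the loss terms of user 1
```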

Hello @ken2022,

I think we are discussing this formula:

[Screenshot of the combined collaborative-filtering cost function:]

J(w, b, x) = (1/2) Σ_{(i,j): r(i,j)=1} ( w(j) · x(i) + b(j) − y(i,j) )² + regularization terms

First, it’s right to say that users who rated more movies contribute more loss terms to the cost function, and each loss term carries equal weight.

Second, your argument for equal-weighting-per-user sounds good, but choosing between it and equal-weighting-per-rating is a which-performs-better question, so I suggest we stay open-minded here and test the two approaches before making any final decision.
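A minimal sketch of what such a test could compare (toy data, all names and values hypothetical): the two candidate cost functions, with regularization omitted for brevity.

```python
import numpy as np

def cf_cost(W, b, X, Y, R, per_user=False):
    """Collaborative-filtering squared-error cost (no regularization).

    per_user=False: equal weight per rating (the assignment's formula).
    per_user=True:  each user's loss is divided by their rating count,
                    so every user contributes equally to the total.
    """
    E = (X @ W.T + b - np.nan_to_num(Y)) * R     # errors on observed entries only
    per_user_loss = 0.5 * (E ** 2).sum(axis=0)   # one loss value per user
    if per_user:
        counts = np.maximum(R.sum(axis=0), 1)    # guard against division by zero
        per_user_loss = per_user_loss / counts
    return per_user_loss.sum()

# Toy data: 4 movies x 2 users, 2 latent features
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 2))   # movie features x^(i)
W = rng.normal(size=(2, 2))   # user parameters w^(j)
b = np.zeros((1, 2))
Y = np.array([[5.0, np.nan], [4.0, np.nan], [3.0, np.nan], [np.nan, 2.0]])
R = (~np.isnan(Y)).astype(float)

print(cf_cost(W, b, X, Y, R, per_user=False))  # per-rating weighting
print(cf_cost(W, b, X, Y, R, per_user=True))   # per-user weighting
```

Training the same model under each cost and comparing validation error would be one way to settle the question empirically.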

Third, a user who rates carefully and frequently over a long period of time may be a more reliable source of ratings, and we might therefore want to give such users higher weight. There is no exact answer to how we should weight them, but your idea and the assignment's approach are both good starting points.


Thanks Raymond. I now understand why equal-weighting-per-rating is more appropriate in this case.