Week 1 collaborative filtering

In the first row, we are able to calculate X1 and X2 because the movie Love at last is rated by all 4 users. How can X1 and X2 be calculated the same way for the following movies when some ratings are missing for the second and third movies?

Missing ratings are ignored, so they do not contribute to calculations.

Hello, @flyunicorn,

All x1, x2 (and w1, w2, b) values are initialized randomly, so each of them has a value from the start. With those values and the 15 available ratings, we form 15 equations, and through optimization the values are trained so that the error between the left- and right-hand sides of each equation is minimized.
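To make this concrete, here is a minimal NumPy sketch of that optimization. The ratings table is my assumption of what the slide shows (it has 15 known ratings), and the names `cost` and `train`, the learning rate, the step count, the seed, and the absence of regularization are all illustrative choices, not the course's implementation.

```python
import numpy as np

# Assumed ratings: rows are movies, columns are users (Alice, Bob, Carol, Dave).
# np.nan marks a missing rating. The values are illustrative.
Y = np.array([
    [5.0, 5.0, 0.0, 0.0],        # Love at last
    [5.0, np.nan, np.nan, 0.0],  # Romance forever
    [np.nan, 4.0, 0.0, np.nan],  # Cute puppies of love
    [0.0, 0.0, 5.0, 4.0],        # Nonstop car chases
    [0.0, 0.0, 5.0, np.nan],     # Swords vs. karate
])
R = ~np.isnan(Y)          # True where a rating exists
Y0 = np.where(R, Y, 0.0)  # placeholder zeros; always multiplied by R below


def cost(X, W, b):
    """Half the sum of squared errors over the rated entries only."""
    err = (X @ W.T + b - Y0) * R  # the mask zeroes out missing ratings
    return 0.5 * np.sum(err ** 2)


def train(steps=3000, lr=0.005, seed=0):
    """Plain gradient descent on X (movie features), W and b (user parameters)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(5, 2))  # x1, x2 for each movie, random start
    W = rng.normal(size=(4, 2))  # w1, w2 for each user, random start
    b = rng.normal(size=(1, 4))  # one bias per user, random start
    initial = cost(X, W, b)
    for _ in range(steps):
        err = (X @ W.T + b - Y0) * R
        grad_X = err @ W
        grad_W = err.T @ X
        grad_b = err.sum(axis=0, keepdims=True)
        X, W, b = X - lr * grad_X, W - lr * grad_W, b - lr * grad_b
    return X, W, b, initial, cost(X, W, b)


X, W, b, initial_cost, final_cost = train()
print("known ratings:", int(R.sum()))
print("cost before and after training:", initial_cost, final_cost)
```

Each `True` entry of `R` corresponds to one of the equations, and the mask is what makes the missing ratings drop out of both the cost and the gradients.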

Andrew presented four such equations at the bottom of the slide you shared, and I can add two more using the ratings of the second movie:

Therefore, I wouldn’t say that x1 and x2 values are “calculated” by “only one” movie’s ratings. Instead, they are “optimized” by “all” ratings. While it may be most intuitive to think that movie 1’s x1 and x2 are only affected by movie 1’s ratings, the fact is that movie 1’s values are also affected by (e.g.) movie 2’s ratings through their common users (Alice and Dave): movie 2’s ratings shape Alice’s and Dave’s w and b values, and those in turn shape movie 1’s x1 and x2. If you extend this idea, you will conclude that each movie’s values are affected by all ratings through all common users. I think this can partly explain why we call this approach “collaborative”. You might search for more discussions on how the approach got its name.
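If you want to see this coupling for yourself, the sketch below trains twice from the same random start: once on an assumed version of the ratings table, and once after changing only movie 2’s two ratings. Movie 1’s own ratings are identical in both runs, yet its learned x1 and x2 come out different. The table values, the helper name `fit`, and the hyperparameters are illustrative assumptions.

```python
import numpy as np


def fit(Y, steps=2000, lr=0.005, seed=0):
    """Gradient descent on the masked squared error; returns movie features X."""
    R = ~np.isnan(Y)          # True where a rating exists
    Y0 = np.where(R, Y, 0.0)  # placeholder zeros, masked out by R
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(Y.shape[0], 2))
    W = rng.normal(size=(Y.shape[1], 2))
    b = rng.normal(size=(1, Y.shape[1]))
    for _ in range(steps):
        err = (X @ W.T + b - Y0) * R
        grad_X = err @ W
        grad_W = err.T @ X
        grad_b = err.sum(axis=0, keepdims=True)
        X, W, b = X - lr * grad_X, W - lr * grad_W, b - lr * grad_b
    return X


# Assumed ratings: rows are movies, columns are Alice, Bob, Carol, Dave.
Y = np.array([
    [5.0, 5.0, 0.0, 0.0],        # movie 1
    [5.0, np.nan, np.nan, 0.0],  # movie 2, rated by Alice and Dave only
    [np.nan, 4.0, 0.0, np.nan],  # movie 3
    [0.0, 0.0, 5.0, 4.0],        # movie 4
    [0.0, 0.0, 5.0, np.nan],     # movie 5
])

# Change only movie 2's ratings; movie 1's row is left untouched.
Y_changed = Y.copy()
Y_changed[1, 0] = 0.0  # Alice's rating of movie 2: 5 -> 0
Y_changed[1, 3] = 5.0  # Dave's rating of movie 2: 0 -> 5

x_movie1_before = fit(Y)[0]
x_movie1_after = fit(Y_changed)[0]
shift = np.abs(x_movie1_after - x_movie1_before).max()
print("movie 1 features, original ratings:", x_movie1_before)
print("movie 1 features, movie 2 changed: ", x_movie1_after)
print("largest change in movie 1's features:", shift)
```

Because both runs share the same seed, the only source of the difference is movie 2’s ratings, which reach movie 1 through Alice’s and Dave’s parameters.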

Cheers,
Raymond
