Please enlighten me. I don’t quite understand why collaborative filtering is good for recommendation system and is better than normal liner regression with w and b as parameters. From what I understand so far is that the collaborative filtering helps tagging the genres for each movie in range of [0,1] and recommend movies based on what genres are tagged on unwatched movies. w and b represent for users’ interests, and x represents for genres. In long term, the users’ interests could be changed. Meanwhile, the movies’ genres won’t be changed at all.
Does that mean if, somehow, I could manually tag genres on all of movies correctly for regular linear regression, the collaborative filtering’s efficiency would be the same as regular linear regression’s?

Let’s not go too far first, and let me ask a few questions: if we go for the regular linear regression approach as you say, given that we have 1000 users and 200 movies, how many models are we going to train? Are we going to train one model for one user? If user A and B both rated movie M, under the regular linear regression approach, would user A has any effect to the model that describes user B? Would there be any “collaboration” under the regular LR approach? When would “collaboration” be a good thing?

I see what you mean here. The regular LR approach has the equation as [1user x n_features] X [n_features x n_movies ]. Technically, only 1 user per training. So if we use collaborative filtering, the equation will be [n_users x n_features] X [n_features x n_movies] and using gradient descent will create collaborative effect between parameters, in this case are users, movies, and features. Is that what you mean?
The last question: if what I was saying above is correct. Then, it would be a good thing when we need to match/recommend 2 or more users that have the same interest or close-enough w matrices. Both LR and collaborative filtering approach could create clustering effect but collaborative filtering is more reliable and less subjective.

Exactly. All users’ and movies’ parameters are trained under one cost function, so that user A and user B affect each other through their common movie C, and similarly, movie D and movie E affect each other through their common user F.

If they had been trained under different cost functions, meaning that we had one model for one user (or as you said: [1user x n_features] X [n_features x n_movies ]), then one immediate effect is a movie M would have different feature values among all user’s models. More precisely, a movie has a different base of feature vector in each user’s model. Consequently, it’s impossible for us to compare two users or to compare two movies.

Yes.

Collaborative filtering is less subjective because something about a movie is contributed by many users. It is also more useful because it produces weights that allow us to make user-user, user-movie, and movie-movie comparisons. It is one collaborative world instead of N user worlds.