Content-Based less precise than Collaborative Filtering?

lczanna · August 2, 2022, 8:03pm

I just finished the week on Recommendation systems, and I cannot wrap my head around something.

Let’s say that we create a movie vector with features for the Content-based filtering.
The features for v_m will be: movie year, movie genre, avg. rating, etc.

By doing this, am I not missing out on the information I had with Collaborative filtering?
Two movies can have the same movie year, movie genre, avg. rating but they could be liked by different types of people.

In collaborative filtering, w_j incorporated this information on what type of people liked the movie.

Should we take w_j (from collaborative filtering) and add it to v_m (content-based), so we do not miss out on any information?

Or have any other insights you might have on it?

thank you in advance!

Luca

rmwkwok · August 3, 2022, 1:13am

Hi Luca,

That’s an interesting idea. You can certainly incorporate any information you think is relevant to the movie into the content-based model. However, remember that collaborative filtering can only generate that information once the movie has been viewed or rated by users, so for a new movie you won’t have it available for the content-based model.

Cheers,
Raymond

lczanna · August 3, 2022, 2:14pm

thank you Raymond for your reply!

Good point. For new movies the collaborative filtering will not be able to ‘learn’ features.
So if I incorporate collaborative filtering information into a content-based model, the model might perform worse on new movies or new users.

Very clear reply,

Luca

rmwkwok · August 3, 2022, 2:42pm

You are welcome Luca!

rmwkwok · August 3, 2022, 3:19pm

@lczanna Let me just add one more point. The problem of a new movie or new user in collaborative filtering is called the “cold-start” problem. Content-based recommendation can fill the gap because it usually relies on content information about the movie (such as genre) or about the user (such as country, inferred from IP address), and that information does not require any user-movie interactions, which collaborative filtering always requires.
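To make the cold-start case concrete, here is a minimal sketch of how one might detect it. It assumes a ratings matrix `Y` with movies as rows and users as columns, with `NaN` marking a missing rating; the function name `find_cold_start` and the toy data are hypothetical, not from the course.

```python
import numpy as np

def find_cold_start(Y):
    """Return (movie indices, user indices) that have no ratings at all.

    Y is assumed to be a movies x users matrix with NaN for missing ratings.
    """
    R = ~np.isnan(Y)                            # True where a rating exists
    cold_movies = np.where(R.sum(axis=1) == 0)[0]
    cold_users = np.where(R.sum(axis=0) == 0)[0]
    return cold_movies, cold_users

# Toy data: the last movie and the last user have no interactions yet.
Y = np.array([
    [5.0, 1.0, np.nan],
    [4.0, np.nan, np.nan],
    [np.nan, np.nan, np.nan],
])
cold_movies, cold_users = find_cold_start(Y)
```

For these rows and columns, collaborative filtering has nothing to learn from, which is where content features such as genre can still be used.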

lczanna · August 3, 2022, 7:38pm

Thanks Raymond, I understand.

One more curiosity: for existing movies and existing users, is collaborative filtering likely to perform better than the content-based algorithm?

If yes, would it make sense to train both algorithms, then:

- use collaborative filtering to predict the rating for existing movies and existing users
- use content-based filtering to predict the rating when either the movie or the user is new?

Recommendation algorithms are an exciting topic!
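The two-branch strategy described above could be sketched roughly as follows. This is only an illustration under assumptions: `cf_users` maps a user ID to learned parameters `(w, b)`, `cf_movies` maps a movie ID to a learned feature vector `x`, and `content_model` is a hypothetical callable standing in for any trained content-based model.

```python
import numpy as np

def predict_rating(user_id, movie_id, cf_users, cf_movies, content_model):
    """Use collaborative filtering when both IDs were seen in training,
    otherwise fall back to the content-based model."""
    if user_id in cf_users and movie_id in cf_movies:
        w, b = cf_users[user_id]          # learned user parameters
        x = cf_movies[movie_id]           # learned movie features
        return float(np.dot(w, x) + b), "collaborative"
    # Either the user or the movie is new: no learned parameters exist.
    return float(content_model(user_id, movie_id)), "content-based"

# Hypothetical toy parameters, for illustration only.
cf_users = {"u1": (np.array([1.0, 2.0]), 0.5)}
cf_movies = {"m1": np.array([3.0, 1.0])}
content_model = lambda user_id, movie_id: 3.0   # placeholder model

known = predict_rating("u1", "m1", cf_users, cf_movies, content_model)
new_movie = predict_rating("u1", "m_new", cf_users, cf_movies, content_model)
```

Returning the branch name alongside the prediction makes it easy to check later how often each model is actually used.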

rmwkwok · August 4, 2022, 12:23am

Using a different strategy depending on the data available makes a lot of sense. Moreover, you may also consider using both for existing users, as you suggested in the first post of this thread. One idea: train a collaborative filtering model to get a user vector, concatenate it with the user’s content information to get a longer vector, feed that vector through some NN layers so that the output has the same shape as the movie vector, then compute the dot product between the user and movie vectors as the predicted rating and minimize the error between that prediction and the actual rating. Of course, this is just one possible way of merging the two pieces of knowledge.
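A minimal NumPy sketch of the forward pass for this kind of hybrid is shown below. All dimensions, the single hidden layer, and the names `hybrid_predict` and `squared_error` are assumptions made for illustration; a real model would be trained with a framework that computes gradients, and the weights here are random rather than learned.

```python
import numpy as np

def dense_relu(x, W, b):
    """One fully connected layer with ReLU activation."""
    return np.maximum(0.0, x @ W + b)

def hybrid_predict(w_cf, user_content, movie_vec, params):
    """Predict a rating from CF user vector + user content features."""
    user_in = np.concatenate([w_cf, user_content])   # the longer user vector
    hidden = dense_relu(user_in, params["W1"], params["b1"])
    v_u = hidden @ params["W2"] + params["b2"]       # same shape as movie_vec
    return float(v_u @ movie_vec)                    # dot product = prediction

def squared_error(pred, rating):
    """Per-example loss to minimize during training."""
    return (pred - rating) ** 2

# Assumed sizes: 4 CF dims, 3 content dims, 8 hidden units, 5-dim movie vector.
rng = np.random.default_rng(0)
params = {
    "W1": rng.normal(size=(7, 8)), "b1": np.zeros(8),
    "W2": rng.normal(size=(8, 5)), "b2": np.zeros(5),
}
w_cf = rng.normal(size=4)            # user vector from collaborative filtering
user_content = rng.normal(size=3)    # e.g. encoded user content features
movie_vec = rng.normal(size=5)       # movie vector from the movie network

pred = hybrid_predict(w_cf, user_content, movie_vec, params)
loss = squared_error(pred, rating=4.0)
```

The quantity being minimized is the error between the dot product and the observed rating, not the dot product itself.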

lczanna · August 4, 2022, 9:08am

Very insightful, thank you Raymond!
