Confusion regarding recommender systems - user & movie example

I’ve watched the lecture a few times but I’m still kind of confused about the features used in the Collaborative Filtering algorithm.

Should I think of it like this: users have their own weights, which are essentially their tastes encoded, and movies have their own features, which are essentially their descriptions encoded. Unlike the earlier linear regression models, where we learn to reproduce the provided labels by ‘discovering’ the pattern via weights, here we learn to reproduce the provided ratings by ‘discovering’ the user’s taste AND the movie’s description via weights. It’s pretty much the same thing, except we now assign each user their own weights, similar to how we assign each unit in a neural network its own weights.
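In other words, I picture the objective as something like this (just my own sketch of it), minimized over both the user parameters w^(j), b^(j) and the movie features x^(i):

$$
\min_{\substack{w^{(1)},\dots,w^{(n_u)},\; b^{(1)},\dots,b^{(n_u)} \\ x^{(1)},\dots,x^{(n_m)}}}
\frac{1}{2}\sum_{(i,j)\,:\,r(i,j)=1}\left(w^{(j)}\cdot x^{(i)} + b^{(j)} - y^{(i,j)}\right)^{2}
+\frac{\lambda}{2}\sum_{j=1}^{n_u}\left\lVert w^{(j)}\right\rVert^{2}
+\frac{\lambda}{2}\sum_{i=1}^{n_m}\left\lVert x^{(i)}\right\rVert^{2}
$$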

My biggest problem is wrapping my head around the movies themselves. “Love at Last” and “Car Chase Man” are not actually part of the dataset, right? We are pretty much just ‘discovering’ what the movies are about given the partially complete user rating matrix, when we work backwards to find the values of x^(1) and x^(2), as well as the weights w_j^(1) and w_j^(2)?

I apologize if this isn’t clear, it’s kind of hard to write down how I’m confused. Please ask questions if you require further explanation, or just let me know if my line of thinking is correct. Thanks!

Lecture referenced: Collaborative Filtering algorithm

Yes.

The examples Andrew gives are just to give an intuitive feel for what the algorithm is doing.

Oh wait this is just Sudoku!!!

To me, your second paragraph was drawing a comparison between linear regression (where w is trained and x is provided) and collaborative filtering (where w and x are both trained). The comparison works for me up until the quoted line above, because it brings in neural networks, which don’t seem directly relevant to the comparison. What do you think?

If you were referring to the following slides, all of the movies there are in the dataset. Each rating in that table is a rating of a movie that is in the dataset.

Yes, but we are ‘discovering’ both what the movies and the users are about, not just the movies.

That’s an interesting association. I am not sure if the below will help, but please skip it if not. It is the matrix representation of the maths in the lecture.
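Roughly, stacking the movie feature vectors as the rows of X, the user parameter vectors as the rows of W, and the user biases into a vector b, every predicted rating comes out of one matrix product (my shorthand, not the exact slide):

$$
\underbrace{X}_{n_m \times n}\,\underbrace{W^{\top}}_{n \times n_u} + \mathbf{1}\,\mathbf{b}^{\top} \approx \underbrace{Y}_{n_m \times n_u},
\qquad
\left(XW^{\top} + \mathbf{1}\,\mathbf{b}^{\top}\right)_{i,j} = w^{(j)}\cdot x^{(i)} + b^{(j)}
$$

and the cost only compares the left-hand side against Y at the entries where r(i, j) = 1, i.e. where user j actually rated movie i.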

Cheers,
Raymond

Yes, I understand it much better now after thinking about this for the last day. That thing I said about neural networks was very off, ignore it.

I realize now that this algorithm works with two different matrices, one for the users and one for the movies, which together map to a rating; so the algorithm essentially combines the two and works backwards from the ratings to find both the users’ weights and the movies’ features.
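To check my understanding, here is a toy numpy sketch of that idea (made-up shapes and ratings, a plain squared-error cost, and gradient descent on the user parameters and the movie features at the same time):

```python
import numpy as np

rng = np.random.default_rng(0)
num_users, num_movies, num_features = 4, 5, 2

# Made-up ratings from 0 to 5; R marks which (movie, user) entries actually have a rating.
Y = rng.integers(0, 6, size=(num_movies, num_users)).astype(float)
R = (rng.random((num_movies, num_users)) > 0.5).astype(float)

# Both the movie features X and the user parameters (W, b) start random and get learned.
X = 0.1 * rng.standard_normal((num_movies, num_features))   # one feature vector per movie
W = 0.1 * rng.standard_normal((num_users, num_features))    # one weight vector per user
b = np.zeros(num_users)                                     # one bias per user

alpha, lambda_ = 0.02, 0.1

for step in range(2000):
    E = (X @ W.T + b - Y) * R        # prediction error, only on the rated entries
    grad_X = E @ W + lambda_ * X     # gradient of the cost w.r.t. the movie features
    grad_W = E.T @ X + lambda_ * W   # gradient of the cost w.r.t. the user weights
    grad_b = E.sum(axis=0)           # gradient w.r.t. the per-user biases
    X -= alpha * grad_X
    W -= alpha * grad_W
    b -= alpha * grad_b

E = (X @ W.T + b - Y) * R
print("squared error on the rated entries:", 0.5 * np.sum(E**2))
```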

Thank you for the math, that definitely helps paint a clearer picture. What confused me was when Andrew added movie features next to the users, I somehow (incorrectly) thought there was still only one matrix. I appreciate the help!

You are welcome, Ryan. It’s great to hear from you and about your progress! :smiley:

Cheers,
Raymond