Can we use embedding layer on content-based recommending system?

So I know that embedding layers are used to represent data in an n-dimensional vector space, and in content-based filtering we use vectors v_u and v_m of the same length so that we can take their dot product.

Does using an embedding layer make sense here?

Hi @tbhaxor great question!

Yes, using embedding layers makes sense in this context. Embedding layers are commonly used to represent data in a vector space where each dimension represents a specific feature. In this case, the features could be attributes such as the genre, director, or actors of a movie.

By representing each movie as a vector of the same length, we can use dot products to compute similarities between movies and recommend similar items to a user.
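A minimal NumPy sketch of the idea, with made-up sizes and random values standing in for learned weights: an embedding layer is essentially a trainable lookup table with one row per user or movie ID, and the predicted affinity is the dot product of the two looked-up vectors.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 5 users, 4 movies, 3-dimensional embeddings.
n_users, n_movies, dim = 5, 4, 3

# In a real model these tables would be learned; here they are random.
user_embeddings = rng.normal(size=(n_users, dim))    # rows are v_u
movie_embeddings = rng.normal(size=(n_movies, dim))  # rows are v_m

def score(user_id: int, movie_id: int) -> float:
    """Predicted affinity = dot product of the two embedding vectors."""
    v_u = user_embeddings[user_id]    # lookup, what an Embedding layer does
    v_m = movie_embeddings[movie_id]
    return float(np.dot(v_u, v_m))

# Rank all movies for user 0 by their dot-product score.
scores = movie_embeddings @ user_embeddings[0]
ranking = np.argsort(-scores)  # movie IDs, best match first
```

In a framework like TensorFlow or PyTorch, the lookup tables would be actual embedding layers trained by gradient descent, but the scoring step is the same dot product.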

So, it does make sense, and it’s actually common practice.

Please let me know if this answers your question.


So why wasn’t it used in the course assignment? To keep things simple?

There are different approaches. The course focuses on building basic knowledge across many areas, so keeping things simple while explaining the core concepts was probably one of the main reasons. In many courses, prioritization plays a major role in deciding what will be included and what won’t.

But taking your learning experience further and picking things up on your own is a great learning opportunity.

Yeah, that makes sense since this is an introductory course. Does DeepLearning.AI have any course on recommender systems?

@pastorsoto One question.

Why do we need a vector representation of each user and movie? Why can’t we use a single real number and simply compute c_u^j \cdot c_m^i, where c_u^j, c_m^i \in \mathbb{R}?

I think using a single real number to represent each user and movie would not be sufficient to capture the complexity and diversity of user preferences and movie characteristics.

Movies can have multiple attributes such as genre, director, actors, release year, etc. Each of these attributes can contribute differently to a user’s preference for a movie. Similarly, a user’s preferences can be based on different factors such as genre, language, ratings, etc. Therefore, a single number cannot capture all these factors and may result in a loss of information.

On the other hand, using a vector representation can provide a more comprehensive and expressive representation of the user and movie preferences. Each element in the vector can correspond to a specific attribute or factor, allowing the model to capture the nuances and complexities of the user and movie preferences. Furthermore, the dot product of two vectors can provide a measure of similarity between the two representations, which can be used to make recommendations.

So I think using vectors really provides an advantage in these representations.
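A small illustration of why a scalar loses information, with made-up feature values. Suppose each component corresponds to a genre, say [comedy, horror, drama]: two movies can then get very different scores from the same user, whereas a single number per movie would score identically for everyone.

```python
import numpy as np

# Hypothetical 3-feature space: [comedy, horror, drama].
movie_a = np.array([1.0, 0.0, 0.0])  # pure comedy
movie_b = np.array([0.0, 1.0, 0.0])  # pure horror

# A user who loves comedy, dislikes horror, and is lukewarm on drama.
user = np.array([2.0, -1.5, 0.3])

print(np.dot(user, movie_a))  # positive: strong match
print(np.dot(user, movie_b))  # negative: poor match

# With one scalar per movie (e.g. an average rating), both movies
# would receive the same score for every user; the per-genre
# distinction above simply cannot be expressed.
```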

You’re correct that this is an introductory course, so some material might not be covered, but there are other courses that might help you as well.

If you have more questions, feel free to open a new topic. I’ll be happy to support you with any queries you have.

How would you interpret a negative, zero, or positive dot product?

What I think is: positive means more likely a match, negative means the opposite (like the two poles of a magnet), and zero means neutral / undefined.
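That intuition can be checked with a few hand-picked vectors (the numbers below are made up purely for illustration): aligned taste profiles give a positive dot product, opposed profiles a negative one, and profiles whose contributions cancel give zero.

```python
import numpy as np

user = np.array([1.0, -1.0])  # likes feature 0, dislikes feature 1

aligned = np.array([2.0, -0.5])     # similar taste profile
opposed = np.array([-1.0, 1.0])     # opposite profile
orthogonal = np.array([1.0, 1.0])   # contributions cancel out

print(np.dot(user, aligned))     # positive: likely a match
print(np.dot(user, opposed))     # negative: opposite tastes
print(np.dot(user, orthogonal))  # zero: no net signal either way
```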

This makes sense because the more components each quantity has, the more precisely we can identify it, since the likelihood of two objects sharing the same set of identifiers decreases.

This is how I think about it.

For example, consider identifying people based on their name alone. If two people have the same name, it can be difficult to differentiate between them. However, if we add additional identifiers such as their birth date, social security number, or passport number, the chances of two people sharing all of those identifiers become much lower, making it easier to differentiate between them.

Yes! That’s the main thing. I think you’ve got the concept!