I have a question about the content based recommandation
We have x_u and x_m that are the user information and the movie information and
we want to optimize a certain cost function that use the dot product of v_u and v_m which
are the output of the 2 neural network we have built that accept as input resp. x_u and x_m.
Can we simply make 1 neural network that have as input a vector x_um that is the concatenation
of both x_u and x_m if not why ?
Thanks in advance !
Welcome to the community!
It would be complex when you doing preprocessing as we know the output of the movies data, and output of the user data, but if from start we get x_um that is the concatenation
of both x_u and x_m what would be the output to train the model, it would take more, and more data preprocessing to get the accurate data that show that the features of all the original data sets, but also if we doing that there are difference between them as :
The image above show 2 things :
- First part of the NN(User Content) it’s concerned only on the part of users for example how many movies that user is rated and what is the age of this user, in addition to many thing also like what how long the new user is stay in site of movies, and so on
- Second part of NN(Movie/item content) is concerned with the movies content(This movie follows the category of what also what age is this movie targeting), the rate of this move, and so on
After that we take the dot product of these output to predict the user will watch this movie and what the rate will the user give to the move
Here we in the first part of NN extract, and elicit important information from the user (by complex calculation), and also the same thing to could get the best prediction
Note if we make 1 neural network that have as input a vector x_um that is the concatenation
of both x_u and x_m, there will be no difference between this neural network and any other neural network
thanks for your fast reply.
So If I have well understood here the problem we really have to consider the movie data and the user data as 2 separated entities right ? (and that why we cannot concatenate them), because considering them as a concatenation would imply a certain relation between the data ? (that why we have to make 2 NN ?)
If we have enough data, maybe the NN will work ? But the model will not be able to well generalize with both of these data are related, right (so we are helping the model)?
yes, also the base of the Content base model is the original NN which is we have the movies and we want to predict the rate of this movie , or we have the users and we want to predict that film will liked by this user or not(classification problem), so to make the model easier we take the output of each model and take the dot product between them to predict the rate of this user if he watch this movie, so we use the easier way is just concatenate the output of this 2 NN, and take the dot product.
Note you may meet Content base model with 1 NN, but personally I didn’t meet it