C3_W2_Collaborative_RecSys_Assignment | What exactly is X

I want to train a model with the same mechanisms and dataset as this assignment and I don’t understand how matrix X is calculated.

Given that \mathbf{W} has shape (443, 10), I assume each row corresponds to a user and each column to one of 10 features, initialized with random numbers and then adjusted (learned) by the network. Given that b has shape (1, 443), I assume each value corresponds to a user and is likewise initialized randomly and learned later on.

But since \mathbf{X} should be constant throughout the learning process, I don’t understand how it is derived from the ml-latest-small dataset.

Furthermore, I’d appreciate it if you could give a hint about how to also include the movie categories in the network.

First two rows of each file in the ml-latest-small dataset:

  • movies.csv:

    movieId,title,genres
    1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy
    
  • ratings.csv:

    userId,movieId,rating,timestamp
    1,1,4.0,964982703
    
  • tags.csv:

    userId,movieId,tag,timestamp
    2,60756,funny,1445714994
    

It’s been said in the lectures that \mathbf{X} is calculated by subtracting the average of all ratings from each user rating for a movie, but that generates an n_m \times n_u matrix like \mathbf{Y}.

On the other hand, \mathbf{Y} = \mathbf{X} \cdot \mathbf{W} + b, but I still can’t understand how to create \mathbf{X}.

X is not constant. The values are learned.

How is it calculated?

Please review the lectures. Both W and X have gradients, so we can use the gradients to learn the values that minimize the cost.
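For reference, here is a sketch of what those gradients look like, assuming the regularized squared-error cost from the collaborative filtering lectures. The notation is an assumption on my part: x^{(i)} is the feature vector of movie i (row i of X), w^{(j)} and b^{(j)} are the parameters of user j, r(i,j) = 1 when user j rated movie i, and \lambda is the regularization strength.

```latex
J(X, W, b) = \frac{1}{2} \sum_{(i,j):\, r(i,j)=1}
    \left( w^{(j)} \cdot x^{(i)} + b^{(j)} - y^{(i,j)} \right)^2
  + \frac{\lambda}{2} \sum_{j} \lVert w^{(j)} \rVert^2
  + \frac{\lambda}{2} \sum_{i} \lVert x^{(i)} \rVert^2

\frac{\partial J}{\partial x^{(i)}} = \sum_{j:\, r(i,j)=1}
    \left( w^{(j)} \cdot x^{(i)} + b^{(j)} - y^{(i,j)} \right) w^{(j)} + \lambda x^{(i)}

\frac{\partial J}{\partial w^{(j)}} = \sum_{i:\, r(i,j)=1}
    \left( w^{(j)} \cdot x^{(i)} + b^{(j)} - y^{(i,j)} \right) x^{(i)} + \lambda w^{(j)}

\frac{\partial J}{\partial b^{(j)}} = \sum_{i:\, r(i,j)=1}
    \left( w^{(j)} \cdot x^{(i)} + b^{(j)} - y^{(i,j)} \right)
```

Note that the gradient with respect to x^{(i)} has the same form as the one with respect to w^{(j)}, with the roles of movies and users swapped. That symmetry is why X can be treated as a learned parameter just like W.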

The key here is that the Y values form a matrix, not just a vector. That gives the labels a second dimension, so we can learn both X and W.
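To make this concrete, here is a minimal NumPy sketch of learning X, W, and b together with plain gradient descent. It is not the assignment's code (which, as far as I recall, uses TensorFlow with an Adam optimizer); the function names, the toy Y matrix, and the hyperparameters are all made up for illustration. It assumes Y has shape (n_m, n_u), R is a same-shaped 0/1 mask of which ratings exist, X has shape (n_m, n), W has shape (n_u, n), and b has shape (1, n_u), so predictions are X @ W.T + b.

```python
import numpy as np

def cofi_cost(X, W, b, Y, R, lambda_):
    """Regularized squared error over the rated entries only."""
    err = (X @ W.T + b - Y) * R  # mask out unrated (movie, user) pairs
    reg = np.sum(X ** 2) + np.sum(W ** 2)
    return 0.5 * np.sum(err ** 2) + 0.5 * lambda_ * reg

def cofi_gradients(X, W, b, Y, R, lambda_):
    """Gradients of cofi_cost with respect to X, W, and b."""
    err = (X @ W.T + b - Y) * R            # shape (n_m, n_u)
    dX = err @ W + lambda_ * X             # shape (n_m, n)
    dW = err.T @ X + lambda_ * W           # shape (n_u, n)
    db = err.sum(axis=0, keepdims=True)    # shape (1, n_u)
    return dX, dW, db

def train(Y, R, num_features=2, lambda_=0.1, lr=0.01, iters=2000, seed=0):
    """Learn X, W, b from Y alone; nothing about X is read from the dataset."""
    rng = np.random.default_rng(seed)
    n_m, n_u = Y.shape
    X = rng.normal(size=(n_m, num_features))  # movie features, random start
    W = rng.normal(size=(n_u, num_features))  # user weights, random start
    b = np.zeros((1, n_u))                    # zeros here for simplicity
    history = []
    for _ in range(iters):
        history.append(cofi_cost(X, W, b, Y, R, lambda_))
        dX, dW, db = cofi_gradients(X, W, b, Y, R, lambda_)
        X -= lr * dX
        W -= lr * dW
        b -= lr * db
    history.append(cofi_cost(X, W, b, Y, R, lambda_))
    return X, W, b, history

# Toy ratings: 4 movies x 3 users, invented for this example.
# A 0 marks "not rated" (real ratings in ml-latest-small start at 0.5).
Y = np.array([[5.0, 4.0, 0.0],
              [4.0, 0.0, 1.0],
              [0.0, 1.0, 5.0],
              [1.0, 0.0, 4.0]])
R = (Y > 0).astype(float)

X, W, b, history = train(Y, R)
print("cost before:", history[0], "cost after:", history[-1])
print("predictions:\n", X @ W.T + b)
```

With the real dataset, Y and R would presumably come from pivoting ratings.csv into a movie-by-user table (rows indexed by movieId, columns by userId), with missing entries filled with 0 in Y and flagged as 0 in R. X is then never computed from the data directly; it starts random and is updated by the same loop that updates W and b.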