Hi, I have a question about the implementation in the lab.
Why does each movie have multiple entries in the training set? I get that the genres are represented as one-hot vectors, but (using the lab’s example) is there any reason movie ID 6874 cannot have 1s for Action, Crime and Thriller on a single line rather than on separate lines?
Also, wouldn’t splitting the genres across lines decrease the effectiveness of the algorithm, given that we lose the information that the same movie has multiple genres? What am I missing?
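To make the question concrete, here is a minimal sketch of the two layouts I mean. The genre list and the helper names (`one_hot_rows`, `multi_hot_row`) are my own assumptions for illustration, not the lab's actual columns or code.

```python
# Hypothetical, shortened genre vocabulary; the lab's real columns may differ.
GENRES = ["Action", "Comedy", "Crime", "Thriller"]


def one_hot_rows(movie_id, genres):
    """One row per genre: each row carries a single 1 (what I think the lab does)."""
    return [
        [movie_id] + [1 if g == genre else 0 for g in GENRES]
        for genre in genres
    ]


def multi_hot_row(movie_id, genres):
    """A single row with a 1 for every genre of the movie (what I expected)."""
    return [movie_id] + [1 if g in genres else 0 for g in GENRES]


movie_genres = ["Action", "Crime", "Thriller"]
separate = one_hot_rows(6874, movie_genres)   # three rows, one genre each
combined = multi_hot_row(6874, movie_genres)  # one row, three genres set

for row in separate:
    print(row)
print(combined)
```

In the first layout, no single row tells the model that 6874 is Action, Crime and Thriller all at once, which is the information I am worried about losing.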
Thanks