C3_W2_RecSysNN_Assignment - pprint_train() returns duplicates for userid

Note: the assignment has been updated so that the following answer is no longer relevant. Please jump to this post for the latest explanation for why userid is duplicated in the user data table.

Hello Stephen, I think we can attack a data problem in many different ways, and the way this assignment adopts is to “expand” user-movie ratings into several rows. For example, if a user rated 3 movies and each movie has 3 genres associated with it, then you will end up seeing 1 (user) x 3 (movies) x 3 (genres) = 9 rows of data in both the user and the item data tables. Because all 9 rows belong to the same user, you will see 9 identical user rows in the user table. In the corresponding rows of the item table, you will see 3 movie ids repeated 3 times each, and among the rows for one movie id, a different genre is selected in each row.
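To make the row count concrete, here is a minimal sketch of that kind of expansion using pandas. The tables, column names (`user_id`, `movie_id`, `genre`, `rating`) and values are all made up for illustration; this is not the assignment's actual data or preprocessing code, only one way such an expansion could produce the duplicates.

```python
import pandas as pd

# Hypothetical toy data: one user who rated three movies.
ratings = pd.DataFrame({
    "user_id": [1, 1, 1],
    "movie_id": [10, 20, 30],
    "rating": [4.0, 3.5, 5.0],
})

# Hypothetical genre table: each movie has three genres, one per row.
movie_genres = pd.DataFrame({
    "movie_id": [10, 10, 10, 20, 20, 20, 30, 30, 30],
    "genre": ["Action", "Comedy", "Drama",
              "Action", "Sci-Fi", "Thriller",
              "Comedy", "Drama", "Romance"],
})

# Joining on movie_id yields one row per (user, movie, genre) combination.
expanded = ratings.merge(movie_genres, on="movie_id")

# Split into row-aligned "user" and "item" tables.
user_rows = expanded[["user_id"]]
item_rows = expanded[["movie_id", "genre"]]

print(len(expanded))                    # 9 rows: 1 user x 3 movies x 3 genres
print(user_rows["user_id"].nunique())   # 1 distinct user across those 9 rows
print(item_rows["movie_id"].value_counts().sort_index().tolist())  # [3, 3, 3]
```

The user table ends up with 9 identical rows simply because it has to stay aligned row-for-row with the item table.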

The assignment should go on to use the trained model for making suggestions after the line in your last screenshot, so you can check out how it uses the result :wink:
