【C3_W2_RecSysNN_Assignment】Why one-hot coding for movie genre?

okleon · August 2, 2022, 2:09pm

Why one-hot coding for movie genre?

For instance, with the one-hot encoded movie genre feature, 39 rows of training data are required to describe user 2's ratings of 16 movies, while the same ratings could have been described with only 16 rows without one-hot encoding (which, by the way, is also more intuitive to me).

There must be some benefit. What is it?

vignesh18 · August 2, 2022, 2:15pm

Hello @okleon ,
Welcome to the DeepLearning.AI community.

That’s a good question. You can follow this thread for relevant discussions on your question.

okleon · August 2, 2022, 2:34pm

Hello @vignesh18,

Thank you for the info.
It looks like I need to improve my searching skills.

samuel_varghese · August 9, 2022, 10:12pm

Can I get more clarity on why there are 39 duplicate entries for user 2 when rating 16 movies? I see item_train and user_train both have 58187 rows, but I am still not getting the connection between the two tables.

okleon · August 9, 2022, 11:16pm

It seems that each combination of user x (movie x movie genre) is one of the 58187 training samples.

For instance: user U1 watched movies M1 and M2, and both movies belong to genres G1 and G2, so there are 4 samples:
U1 M1 G1
U1 M1 G2
U1 M2 G1
U1 M2 G2

If you link the two tables together horizontally row by row, each row is different.
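
The expansion described above can be sketched in a few lines of Python. This is only a toy illustration: the dictionaries `user_movies` and `movie_genres` and the helper `expand_samples` are hypothetical names made up for this example, not part of the assignment code, and the real dataset may be built differently.

```python
# Hypothetical toy data: the movies each user rated.
user_movies = {"U1": ["M1", "M2"]}
# Hypothetical toy data: the genres each movie belongs to.
movie_genres = {"M1": ["G1", "G2"], "M2": ["G1", "G2"]}


def expand_samples(user_movies, movie_genres):
    """Return one (user, movie, genre) tuple per genre of every rated movie."""
    samples = []
    for user, movies in user_movies.items():
        for movie in movies:
            for genre in movie_genres[movie]:
                samples.append((user, movie, genre))
    return samples


samples = expand_samples(user_movies, movie_genres)
for user, movie, genre in samples:
    print(user, movie, genre)
```

With two movies that each carry two genres, the loop produces four rows for the single user, which is why the row count can exceed the number of rated movies.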

samuel_varghese · August 9, 2022, 11:55pm

Thank you! So the rows line up (implicitly, i.e. there is no foreign key like in SQL). Rows 0-38 are user 2's reviews of the movies in rows 0-38, i.e. movie 6874, 8798 etc. with all the permutations of the genres.
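
A minimal sketch of that implicit row alignment, assuming two NumPy arrays with the same number of rows. The arrays below are made-up toy data with hypothetical columns (a user id column, and a movie id plus genre index), not the real `user_train` and `item_train` contents.

```python
import numpy as np

# Toy stand-in for the user table: one column, the user id (hypothetical layout).
user_rows = np.array([[2], [2], [2], [2], [2]])
# Toy stand-in for the item table: movie id and a genre index (hypothetical layout).
item_rows = np.array([
    [6874, 0],
    [6874, 1],
    [6874, 2],
    [8798, 0],
    [8798, 1],
])

# The tables are linked only by position, so their row counts must match.
assert user_rows.shape[0] == item_rows.shape[0]

# Joining horizontally pairs row i of one table with row i of the other.
joined = np.hstack([user_rows, item_rows])

# Count the rows for user 2 and the distinct movies among those rows.
rows_for_user_2 = joined[joined[:, 0] == 2]
n_rows = rows_for_user_2.shape[0]
n_movies = np.unique(rows_for_user_2[:, 1]).size
print(n_rows, n_movies)
```

In this toy case user 2 has five rows but only two distinct movies, the same pattern as 39 rows for 16 movies.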
