C3W2_Content-Based Filtering

In the C3_W2_RecSysNN_Assignment, the data was normalized before splitting into training and testing sets. Isn’t this a form of data leakage? Shouldn’t normalization be done after splitting, using only the training data?

Hello, @maab,

Welcome to this community!

Yes, I agree with you: those scalers should be fitted on the training data only, since the test data is not supposed to be known at that stage. I will share your post with the course team for their review.

Cheers,
Raymond
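
For anyone who wants to see the split-then-fit order in code, here is a minimal sketch. It assumes scikit-learn's `StandardScaler` and `train_test_split`, and it uses a synthetic feature matrix `X` as a hypothetical stand-in, so it is not the assignment's actual code or data.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a feature matrix (hypothetical, not the assignment's data).
rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=2.0, size=(100, 3))

# Split first, so the test rows play no part in fitting the scaler.
X_train, X_test = train_test_split(X, train_size=0.8, shuffle=True, random_state=1)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # mean and std come from training rows only
X_test_scaled = scaler.transform(X_test)        # reuse the training mean and std

# The fitted mean matches the training-set mean, not the full-data mean.
print(np.allclose(scaler.mean_, X_train.mean(axis=0)))
```

With this order, the test set is scaled using statistics it did not contribute to, which is what happens with genuinely unseen data at prediction time.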

Thanks for the response!
