Guys, seriously, I finished the recommender systems lab, the one that predicts movie ratings. After I ran the cell with the predictions, I noticed it recommended movies I had never heard of, and I ended up enjoying them a lot.
No other recommender system has managed that for me. Whenever I try Netflix, Google, Amazon, or ChatGPT, they recommend things I don't actually like. Could anyone explain to me why this one works and the others don't? Is it because it's a smaller dataset?
I think there are a few possible reasons.
First, as you mentioned, the dataset used in the lab is much smaller and more focused. That can sometimes make recommendations feel more relevant because the model is working with a simpler, cleaner dataset.
Second, while Netflix and Amazon have incredibly sophisticated recommendation systems, they may not have learned enough about your individual preferences yet. Even after many interactions, building an accurate representation of one person’s tastes is challenging, especially if your interests are diverse or change over time.
Finally, every interaction on those platforms becomes a signal. Watching a movie—even if you don’t end up enjoying it—can influence your recommendation profile. The system has to infer whether you watched it because you liked it, were curious, or simply gave it a try, and that ambiguity can affect future recommendations.
Sometimes a simpler model trained on a smaller, cleaner dataset can produce recommendations that feel surprisingly accurate.
Besides @gent.spah’s many good points, we can think about the objectives. The lab model recommends movies watched by users with similar rating history. Commercial systems might prefer movies that favor their business, their KPIs, or safe choices (trending content). I think this is very important.
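To make the point about objectives concrete, here is a toy sketch of how mixing a business signal into the score can reorder the same candidates. Everything in it is an assumption for illustration: the titles, the numbers, the `business_weight` parameter, and the linear blend itself. I don't know how any real platform actually combines its signals.

```python
import numpy as np

# Hypothetical numbers for illustration only.
titles = ["niche_gem", "solid_pick", "trending_hit"]
relevance = np.array([0.9, 0.7, 0.4])   # predicted fit for this one user
popularity = np.array([0.1, 0.5, 1.0])  # stand-in for a business/trending signal

def rank(business_weight):
    """Order titles by a blend of personal relevance and popularity."""
    score = (1 - business_weight) * relevance + business_weight * popularity
    return [titles[i] for i in np.argsort(-score)]

print(rank(0.0))  # pure relevance: niche_gem comes first
print(rank(0.6))  # popularity-heavy blend: trending_hit comes first
```

With the weight at zero the ranking follows personal fit only, which is closer to what the lab optimizes. Once the popularity term dominates, the safe trending title moves to the top even though its predicted fit is the lowest.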
We can tell the lab our list of favorite and least favorite movies, but we can’t do the same with those systems. Even if the whole list is available on the platform, how they interpret our signals (as Gent mentioned) is unclear to us.
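For anyone curious what "users with similar rating history" can look like in code, below is a minimal sketch of user-based collaborative filtering with cosine similarity. It is not the lab's actual code, and the lab may well use a different technique. The rating matrix, the `recommend` function, and the `new_user` vector are all made up for illustration.

```python
import numpy as np

# Toy rating matrix: rows are users, columns are movies, 0 means "not rated".
# All values are invented for illustration.
ratings = np.array([
    [5, 4, 0, 1, 1],
    [4, 5, 5, 1, 2],
    [1, 0, 1, 5, 4],
    [1, 1, 2, 4, 5],
], dtype=float)

def recommend(new_user, ratings, top_k=2):
    """Score unseen movies by similarity-weighted ratings of other users."""
    # Cosine similarity between the new user and each existing user.
    norms = np.linalg.norm(ratings, axis=1) * np.linalg.norm(new_user)
    sims = ratings @ new_user / np.where(norms == 0, 1.0, norms)
    # Weighted average rating per movie, counting only users who rated it.
    rated = (ratings > 0).astype(float)
    weight_sum = sims @ rated
    scores = (sims @ ratings) / np.where(weight_sum == 0, 1.0, weight_sum)
    # Hide movies the new user has already rated.
    scores[new_user > 0] = -np.inf
    order = np.argsort(scores)[::-1]
    return [int(i) for i in order[:top_k] if np.isfinite(scores[i])]

# The new user loved movie 0 and disliked movie 3.
new_user = np.array([5, 0, 0, 1, 0], dtype=float)
print(recommend(new_user, ratings))  # → [1, 2]
```

The explicit ratings are the whole input here, so there is nothing for the model to guess about why we watched something. That is the contrast with platforms that have to infer taste from clicks and watch history.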
Data quality. The lab uses only curated MovieLens data. People go to the MovieLens website and provide their ratings, so the ratings do not come from just anyone. Commercial systems, however, take all kinds of signals from everyone, which means they may need more sophisticated ways to filter and preprocess the data to achieve the same level of accuracy.