I’m curious why we use the dot product to determine compatibility between a user and a movie, while squared distance is used to measure similarity between two movies, even though both user and movie embeddings from the neural network have the same dimensions.
Hi @abhilash341
The choice between dot product and squared distance depends on the goal:
The dot product measures compatibility or alignment, and it is ideal for user-movie matching because it captures how well the user’s preferences align with the movie’s features.
On the other hand, squared distance measures dissimilarity, which is useful for comparing two movies because it quantifies how far apart their embeddings are in the feature space.
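To make the two roles concrete, here is a minimal NumPy sketch. The embedding size (32) and the number of candidate movies (5) are made-up values, and the random vectors stand in for the outputs of the user and movie networks:

```python
import numpy as np

# Hypothetical 32-dimensional embeddings standing in for network outputs
rng = np.random.default_rng(0)
user_vec = rng.normal(size=32)          # one user embedding
movie_vecs = rng.normal(size=(5, 32))   # embeddings for 5 candidate movies

# Dot product: higher score = better user-movie compatibility
scores = movie_vecs @ user_vec
best_movie = int(np.argmax(scores))

# Squared distance: lower value = more similar movies
diffs = movie_vecs - movie_vecs[0]            # compare every movie to movie 0
sq_dists = np.sum(diffs ** 2, axis=1)
most_similar = int(np.argmin(sq_dists[1:])) + 1  # skip movie 0 itself
```

Note the direction of the comparison flips: with the dot product we take the argmax (bigger is better), while with squared distance we take the argmin (smaller is closer).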
Hope it helps! Feel free to ask if you need further assistance.
Hello @abhilash341,
Considering the following 2-dimensional case, we can see that they are actually related: for a = (a₁, a₂) and b = (b₁, b₂), the squared distance is ‖a − b‖² = (a₁ − b₁)² + (a₂ − b₂)² = (a₁² + a₂²) + (b₁² + b₂²) − 2(a₁b₁ + a₂b₂) = ‖a‖² + ‖b‖² − 2 a⋅b.
And clearly the relation is much simpler if the embeddings are normalized, which is how we treat them in the C3 W2 assignment 2, as I quoted below:
(From exercise 1)
So, as long as we normalize the embeddings, squared distance and dot product are not really different, and to me, using one or the other is a matter of choice. However, I believe people sometimes prefer the dot product of two normalized embeddings because we remember it has a bounded range of -1 to 1. Having a bounded range is good for deciding how to scale the labels for the training data. Of course, with the simple maths above, we also know that the squared distance between two normalized embeddings is bounded, but I guess perhaps that may be less intuitive?
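The relation and both bounded ranges can be checked numerically. This is a small sketch with arbitrary random vectors (the dimension 16 is just an example):

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.normal(size=16)
b = rng.normal(size=16)

# L2-normalize so both embeddings become unit vectors
a = a / np.linalg.norm(a)
b = b / np.linalg.norm(b)

dot = a @ b
sq_dist = np.sum((a - b) ** 2)

# For unit vectors: ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b = 2 * (1 - a.b)
assert np.isclose(sq_dist, 2 * (1 - dot))

# Both quantities are bounded: a.b is in [-1, 1], so ||a - b||^2 is in [0, 4]
assert -1 <= dot <= 1
assert 0 <= sq_dist <= 4
```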
@abhilash341, I don’t know which course material you are referring to, so I cannot be more specific. If we want to measure similarity, we can use the dot product, because a larger value means more similar. Conversely, squared distance is better for measuring dissimilarity. However, whether to measure similarity or dissimilarity is, to me, really a matter of preference.
Cheers,
Raymond
This question relates to Practice Lab 2 from Week 2 of the unsupervised learning course.
If a and b are unit vectors, then a₁² + a₂² + b₁² + b₂² should equal 2, correct?
Given that, the squared distance would be 2⋅(1 − a⋅b), right?
So, the conclusion appears to be that after normalization, the squared distance also depends entirely on a⋅b, with the other terms remaining constant. Thanks for the excellent explanation, as always, Raymond!
Hello, @abhilash341!
I completely agree with your conclusion, and as for your following question:
By definition, a unit vector has a length of 1.
As for why the assignment chose squared distance for finding similar items and the dot product for the neural network, again, I believe there is no reason that it must be arranged that way. Even if we used the dot product for everything, it would work just fine. However, this is a course and we are learners, so to me it is not a bad idea for us to come across more techniques. I think the best outcome of such an arrangement is having learners like us ask “why such an arrangement”.
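The "either would work" point can be illustrated directly: on normalized embeddings, ranking movies by largest dot product and ranking by smallest squared distance give the same ordering, since one is a monotonic transform of the other. A small sketch with made-up movie embeddings:

```python
import numpy as np

rng = np.random.default_rng(2)
# 10 hypothetical movie embeddings, L2-normalized row-wise
V = rng.normal(size=(10, 8))
V = V / np.linalg.norm(V, axis=1, keepdims=True)

query = V[0]
dots = V @ query                              # similarity: larger is closer
sq_dists = np.sum((V - query) ** 2, axis=1)   # dissimilarity: smaller is closer

# Because ||v - q||^2 = 2 - 2 v.q on unit vectors, both orderings agree
order_by_dot = np.argsort(-dots)
order_by_dist = np.argsort(sq_dists)
assert np.array_equal(order_by_dot, order_by_dist)
```

So the choice between the two measures changes the numbers we look at, but not which items come out as the most similar.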
Cheers,
Raymond
Hello @abhilash341,
I rewrote the steps in this reply with better (and correct) variable names.