I’m curious why we use the dot product to determine compatibility between a user and a movie, while squared distance is used to measure similarity between two movies, even though both user and movie embeddings from the neural network have the same dimensions.
Hi @abhilash341
The choice between dot product and squared distance depends on the goal:
The dot product measures compatibility or alignment, and it is ideal for user-movie matching because it captures how well the user’s preferences align with the movie’s features.
On the other hand, squared distance measures dissimilarity, which is useful for comparing two movies because it quantifies how far apart their embeddings are in the feature space.
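To make the two roles concrete, here is a minimal NumPy sketch. The embedding size (32) and the number of candidate movies (5) are made-up values, and the random vectors stand in for the outputs of the user and movie networks:

```python
import numpy as np

# Hypothetical 32-dimensional embeddings standing in for network outputs
rng = np.random.default_rng(0)
user_vec = rng.normal(size=32)          # one user embedding
movie_vecs = rng.normal(size=(5, 32))   # embeddings for 5 candidate movies

# Dot product: higher score = better user-movie compatibility
scores = movie_vecs @ user_vec
best_movie = int(np.argmax(scores))

# Squared distance: lower value = more similar movies
diffs = movie_vecs - movie_vecs[0]            # compare every movie to movie 0
sq_dists = np.sum(diffs ** 2, axis=1)
most_similar = int(np.argmin(sq_dists[1:])) + 1  # skip movie 0 itself
```

Note the direction of the comparison flips: with the dot product we take the argmax (bigger is better), while with squared distance we take the argmin (smaller is closer).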
Hope it helps! Feel free to ask if you need further assistance.
Hello @abhilash341,
Considering the following 2-dimensional case, we can see that they are actually related: for a = (a₁, a₂) and b = (b₁, b₂), the squared distance is ‖a − b‖² = (a₁ − b₁)² + (a₂ − b₂)² = (a₁² + a₂²) + (b₁² + b₂²) − 2(a₁b₁ + a₂b₂) = ‖a‖² + ‖b‖² − 2 a⋅b.
And clearly the relation is much simpler if the embeddings are normalized, which is how we treat them in the C3 W2 assignment 2, as I quoted below:
(From exercise 1)
So, as long as we normalize the embeddings, squared distance and dot product are not really different, and to me, using one or the other is a matter of choice. However, I believe people sometimes prefer the dot product of two normalized embeddings because we remember it has a bounded range of -1 to 1. Having a bounded range is good for deciding how to scale the labels for the training data. Of course, with the simple maths above, we also know that the squared distance between two normalized embeddings is bounded, but I guess perhaps that may be less intuitive?
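The relation and both bounded ranges can be checked numerically. This is a small sketch with arbitrary random vectors (the dimension 16 is just an example):

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.normal(size=16)
b = rng.normal(size=16)

# L2-normalize so both embeddings become unit vectors
a = a / np.linalg.norm(a)
b = b / np.linalg.norm(b)

dot = a @ b
sq_dist = np.sum((a - b) ** 2)

# For unit vectors: ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b = 2 * (1 - a.b)
assert np.isclose(sq_dist, 2 * (1 - dot))

# Both quantities are bounded: a.b is in [-1, 1], so ||a - b||^2 is in [0, 4]
assert -1 <= dot <= 1
assert 0 <= sq_dist <= 4
```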
@abhilash341, I don’t know which course material you are referring to, so I cannot be more specific. If we want to measure similarity, we can use the dot product, because a larger value means more similar. Conversely, squared distance is better for measuring dissimilarity. However, whether to measure similarity or dissimilarity is, to me, really a matter of preference.
Cheers,
Raymond
This question relates to Practice Lab 2 from Week 2 of the unsupervised learning course.
If a and b are unit vectors, then a₁² + a₂² + b₁² + b₂² should equal 2, correct?
Given that, the squared distance would be 2⋅(1 − a⋅b), right?
So, the conclusion appears to be that after normalization, the squared distance also depends entirely on a⋅b, with the other terms remaining constant. Thanks for the excellent explanation, as always, Raymond!
Hello, @abhilash341!
I completely agree with your conclusion, and as for your following question:
By definition, a unit vector has a length of 1.
As for why the assignment chose squared distance for finding similar items and the dot product for the neural network, again, I believe there is no reason that it must be arranged that way. Even if we used the dot product for everything, it would work just fine. However, this is a course and we are learners, so to me it is not a bad idea for us to come across more techniques. I think the best outcome of such an arrangement is having learners like us ask “why such an arrangement”.
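The "either would work" point can be illustrated directly: on normalized embeddings, ranking movies by largest dot product and ranking by smallest squared distance give the same ordering, since one is a monotonic transform of the other. A small sketch with made-up movie embeddings:

```python
import numpy as np

rng = np.random.default_rng(2)
# 10 hypothetical movie embeddings, L2-normalized row-wise
V = rng.normal(size=(10, 8))
V = V / np.linalg.norm(V, axis=1, keepdims=True)

query = V[0]
dots = V @ query                              # similarity: larger is closer
sq_dists = np.sum((V - query) ** 2, axis=1)   # dissimilarity: smaller is closer

# Because ||v - q||^2 = 2 - 2 v.q on unit vectors, both orderings agree
order_by_dot = np.argsort(-dots)
order_by_dist = np.argsort(sq_dists)
assert np.array_equal(order_by_dot, order_by_dist)
```

So the choice between the two measures changes the numbers we look at, but not which items come out as the most similar.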
Cheers,
Raymond
Hello @abhilash341,
I rewrote the steps in this reply with better (and correct) variable names.