Hello @abhilash341,
Considering the following 2-dimensional case, we can see that they are actually related:
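Writing it out for two 2-dimensional embeddings $a = (a_1, a_2)$ and $b = (b_1, b_2)$ (my own working of the relation):

$$
\|a - b\|^2 = (a_1 - b_1)^2 + (a_2 - b_2)^2 = \|a\|^2 + \|b\|^2 - 2\,(a \cdot b)
$$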
And clearly the relation is much simpler if the embeddings are normalized, which is how we treat them in C3 W2 Assignment 2, as I quoted below:
(From exercise 1)
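With normalized embeddings ($\|a\| = \|b\| = 1$), the relation above therefore reduces to $\|a - b\|^2 = 2 - 2\,(a \cdot b)$.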
So, as long as we normalize the embeddings, squared distance and dot product carry essentially the same information, and to me, using one or the other is a matter of choice. However, I believe people sometimes prefer the dot product of two normalized embeddings because it has a bounded range of -1 to 1. Having a bounded range is good for deciding how to scale the labels in the training data. Of course, with the simple maths above, we also know that the squared distance between two normalized embeddings is bounded (between 0 and 4), but perhaps that is less intuitive?
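Here is a minimal NumPy sketch (my own illustration, not from the assignment) that checks the identity and the bounds numerically:

```python
import numpy as np

# Quick numerical check: for L2-normalized embeddings, squared distance
# and dot product carry the same information.
rng = np.random.default_rng(0)

a = rng.normal(size=128)
b = rng.normal(size=128)

# Normalize both embeddings to unit length.
a = a / np.linalg.norm(a)
b = b / np.linalg.norm(b)

dot = np.dot(a, b)              # bounded in [-1, 1]
sq_dist = np.sum((a - b) ** 2)  # bounded in [0, 4]

# The identity ||a - b||^2 = 2 - 2 (a . b) holds for unit vectors.
print(dot, sq_dist, np.isclose(sq_dist, 2 - 2 * dot))
```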
@abhilash341, I don’t know which course material you are referring to, so I cannot be more specific. If we want to measure similarity, then we can use the dot product, because a larger value means more similar. Conversely, squared distance is better for measuring dissimilarity. However, whether to measure similarity or dissimilarity is, to me, really a matter of preference.
Cheers,
Raymond