Why not use coef of correlation instead of cosine similarity between embedding vectors

pawarbi · May 8, 2024, 8:45pm

I understand that with cosine similarity we are comparing the direction of two vectors rather than their magnitude, but practically speaking, why not use the coefficient of correlation? Can someone give me a practical explanation? Thanks.

paulinpaloalto · May 9, 2024, 3:42am

There are a number of different ways you could compute a similarity metric between two vectors. There are several forms of correlation coefficients, but those take more computation: for the most commonly used definition (Pearson), you need to compute the covariance and the standard deviation of both inputs. Or you could use the Euclidean distance between the two vectors, but for unit-length vectors that gives you a number between 0 and 2, where smaller means more similar. I have not done any research to see if ML/DL people mention why they chose cosine similarity, but note that it’s very cheap to compute because of this mathematical relationship:

v \cdot w = \|v\| \, \|w\| \cos(\theta)

Where \theta is the angle between the two vectors. So the cosine similarity can be computed as:

\cos(\theta) = \displaystyle \frac{v \cdot w}{\|v\| \, \|w\|}

But we normalize embedding vectors to have length one, so both norms in the denominator are 1. That makes the computation extremely cheap: just a single dot product, and GPUs are pretty good at performing those.
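To make the comparison concrete, here is a minimal sketch in plain Python (standard library only). The vectors `v` and `w` and all the helper names are made up for illustration; they don't come from any particular embedding library. It shows that once the vectors are normalized, the cosine similarity is just the dot product, and that the Pearson correlation amounts to the cosine similarity of the mean-centered vectors, so it needs an extra centering pass over each vector.

```python
import math

def dot(v, w):
    # Plain dot product of two equal-length sequences.
    return sum(a * b for a, b in zip(v, w))

def norm(v):
    # Euclidean length ||v||.
    return math.sqrt(dot(v, v))

def normalize(v):
    # Scale v to unit length.
    n = norm(v)
    return [a / n for a in v]

def cosine_similarity(v, w):
    # cos(theta) = (v . w) / (||v|| * ||w||)
    return dot(v, w) / (norm(v) * norm(w))

def pearson_correlation(v, w):
    # Pearson correlation equals the cosine similarity of the
    # mean-centered vectors, hence the extra centering step.
    mean_v = sum(v) / len(v)
    mean_w = sum(w) / len(w)
    centered_v = [a - mean_v for a in v]
    centered_w = [b - mean_w for b in w]
    return cosine_similarity(centered_v, centered_w)

# Hypothetical 4-dimensional "embeddings", chosen only for illustration.
v = [0.2, -1.0, 0.5, 0.3]
w = [0.1, -0.8, 0.9, 0.4]

v_hat, w_hat = normalize(v), normalize(w)

cos_full = cosine_similarity(v, w)   # full formula with both norms
cos_unit = dot(v_hat, w_hat)         # unit vectors: dot product only
corr = pearson_correlation(v, w)     # requires centering first

# Euclidean distance between the unit vectors, which lies in [0, 2].
dist_unit = norm([a - b for a, b in zip(v_hat, w_hat)])

print(cos_full, cos_unit, corr, dist_unit)
```

The two cosine values agree up to floating-point rounding. For unit vectors the squared Euclidean distance equals 2 - 2 cos(\theta), so ranking neighbors by distance or by cosine similarity gives the same order.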
