Range of cosine similarity: between 0 and 1

Lecture says,

What is to stop it from getting a range between -1 and 1?


Nothing stops it from getting a range between -1 and 1. In fact, cosine similarity can take any value in that range.

As for why the summary indicates that the value goes between 0 and 1, I can venture an answer:

When the C.S. = 1, the vectors point in the same direction (identical up to scale). This is one of the limits.
When the C.S. = 0, the vectors are orthogonal (no match). This is the other limit.
When the C.S. = -1, they point in opposite directions - I would argue that in this case similarity is out of the question.
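The three cases above can be sketched with a minimal NumPy example (the vectors here are made up just to hit each limit):

```python
import numpy as np

def cosine_similarity(a, b):
    # cos(theta) = (a . b) / (|a| * |b|)
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

a = np.array([1.0, 2.0])

print(cosine_similarity(a, a))                      # identical direction -> 1.0
print(cosine_similarity(a, np.array([-2.0, 1.0])))  # orthogonal -> 0.0
print(cosine_similarity(a, -a))                     # opposite directions -> -1.0
```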

It’s a weak answer, but it’s the only one I can think of.

In summary, between 0 and 1 we are in the range of similarities.

Again, this is me venturing an answer. Let’s hope for a better answer from someone more knowledgeable.

Thank you! Something to think about.

Definitely! You got me thinking here! Hopefully someone jumps in with more information!


Hello @ajeancharles,

are the vectors’ components constrained to non-negative values only?

Raymond

If we are talking about embeddings, the vectors are not constrained to non-negative values only. So it is technically possible to have -1 as the cosine similarity.

Now, in Word2Vec, GloVe, and other similar word embeddings, these cases are rare, and maybe that’s why it is said that the range is 0 to 1.

Yes… but I think we will need @ajeancharles to investigate the question and provide the context that can explain the range.

Non-negative components is one possible reason.

A term frequency vector contains only non-negative components.

A rescaled cosine similarity can also be a possibility.
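Both possibilities can be illustrated in a few lines. The word counts below are hypothetical, and the (c + 1) / 2 rescaling is just one common way to map [-1, 1] onto [0, 1]:

```python
import numpy as np

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Hypothetical term-frequency vectors: counts of two words in two
# corpora. Counts are never negative, so both vectors lie in the
# first quadrant and their cosine similarity cannot be negative.
corpus_1 = np.array([6.0, 1.0])
corpus_2 = np.array([1.0, 5.0])

sim = cosine_similarity(corpus_1, corpus_2)
print(sim)  # non-negative

# One possible rescaling that maps [-1, 1] onto [0, 1]:
rescaled = (sim + 1.0) / 2.0
print(rescaled)
```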

Raymond

Thanks! I have to think about this.

Hey @ajeancharles,

It is in fact the intended range of cosine similarity. However, you will find that for the example discussed in the lecture video, the vectors represent the frequencies of 2 distinct words, i.e., the vectors have non-negative components only, as @rmwkwok pointed out. Therefore, Younes stated that the cosine similarity ranges between 0 and 1. Let me quote an excerpt from the video for your reference:

Remember that’s in this example, you have a vector space where the representations of the corpora is given by the number of occurrences of the words disease and eggs.

Also, when Younes stated the range of cosine similarity, he ensured to state the following:

for the vector spaces you’ve seen so far, the cosine similarity takes values between 0 and 1.

Nonetheless, I will raise an issue regarding this, to perhaps add some pop-up, indicating that the actual range of cosine similarity is [-1, 1]. Thanks a lot for creating this thread.

Cheers,
Elemento

Hi @Elemento,

My two cents.

I think @Juan_Olano’s examples very well illustrate that, ultimately, if we want to remember one range, -1 to 1 should be the one to remember.

I think it’s worthwhile to first explain that the non-negativity is a special case: because we construct our word vectors from term frequencies, the components are non-negative, and it is this special case that limits the range to 0 to 1, since the angle between any two such vectors is always between 0 and 90 degrees. Graphically, the vectors are confined to the first quadrant, so the angles are always less than or equal to 90 degrees. Then, as we progress to learning word vectors with neural networks (when?), our word vectors’ components can take any value, which makes it possible for the angle to be between 0 and 180 degrees. Graphically, the vectors can point in any direction, so the angle between two vectors can exceed 90 degrees, and that is where negative cosine similarity comes in.
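To make the general case concrete, here is a small sketch with made-up "learned" vectors whose components can be negative (the values are purely illustrative, not from any trained model):

```python
import numpy as np

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Made-up learned word vectors: components may be negative,
# so the vectors can point into any quadrant.
v_good = np.array([0.9, 0.4])
v_bad = np.array([-0.8, -0.5])  # roughly opposite direction

sim = cosine_similarity(v_good, v_bad)
print(sim)  # negative: the angle between them exceeds 90 degrees
```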

This should make for a smoother transition from the lecture’s special case to the more general one.

I am not sure if my version of the course is still up to date, but there is a reading item after the video for Cosine Similarity. It might be a good place for a longer explanation. That reading item actually re-emphasizes the non-negativity and the range of the angle, so it might be good to expand from there.

Cheers,
Raymond

Thanks, I will do some reading.

Hi @Juan_Olano

I think you have a typo here - 0 → 1.

Just for future readers.

Cheers


Thanks! Fixed! Appreciate the heads up.


The pleasure is all mine!

Hey @rmwkwok,

Although the reading item doesn’t emphasize the negativity of the vector components yet, that’s indeed a good suggestion. Let me pass it on to Mubsi.

Cheers,
Elemento
