Why does a simple matmul of embedding vectors describe their similarity?

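The principle is what I wrote above: the dot product works like cosine similarity (it is the cosine similarity scaled by the lengths of the two vectors), so it measures how close the embedding representations of two words are. The same scores are also used to find where the attention is placed in the sentence.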
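To make this concrete, here is a minimal NumPy sketch. The embedding values are made up for illustration, and the scaling by the square root of the dimension is an assumption borrowed from standard scaled dot-product attention, not something specific to your model. It shows that a single matmul gives all pairwise dot products, that normalizing them gives cosine similarity, and that a softmax over each row turns them into attention weights.

```python
import numpy as np

# Hypothetical toy embeddings: 3 tokens, 4 dimensions each (values are made up).
E = np.array([
    [1.0, 0.0, 1.0, 0.0],   # token 0
    [2.0, 0.0, 2.0, 0.0],   # token 1: same direction as token 0, but longer
    [0.0, 1.0, 0.0, -1.0],  # token 2: orthogonal to tokens 0 and 1
])

# One matmul computes every pairwise dot product: scores[i, j] = E[i] . E[j]
scores = E @ E.T

# Dividing by the vector lengths turns the dot products into cosine similarities.
norms = np.linalg.norm(E, axis=1, keepdims=True)
cosine = scores / (norms * norms.T)


def softmax(x):
    """Row-wise softmax, shifted by the row max for numerical stability."""
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)


# Assumed scaling (sqrt of the embedding dimension), as in scaled dot-product attention.
attention = softmax(scores / np.sqrt(E.shape[1]))

print(scores)
print(cosine)
print(attention)
```

Tokens 0 and 1 point in the same direction, so their cosine similarity is 1, while token 2 is orthogonal to both and gets a score of 0. Each row of `attention` sums to 1 and puts the most weight on the tokens with the largest dot product.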

Have a look at this post here in the forum:
