The principle is what I wrote above, its like cosine similarity and measuring distance between representations of words in embeddings, as well as finding where the attentions is placed on the sentence.
Have a look on this post here in the forum:
The principle is what I wrote above, its like cosine similarity and measuring distance between representations of words in embeddings, as well as finding where the attentions is placed on the sentence.
Have a look on this post here in the forum: