Week 4 of the Sequence Models course

While watching the self-attention video in the Transformers section, I had a question: how do we calculate the query (q), key (k), and value (v) for each word?

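They are not stored per word. They are computed from each word's embedding using weight matrices that are learned during training. For a word with embedding x (with positional encoding added), q = W^Q x, k = W^K x, and v = W^V x, where W^Q, W^K, and W^V are trainable parameters of the attention layer, updated by backpropagation along with the rest of the network.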
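In case a concrete example helps, here is a minimal NumPy sketch of how q, k, and v can be obtained from word embeddings and then used in scaled dot-product self-attention. Everything in it is an assumption for illustration: the sizes (`seq_len`, `d_model`, `d_k`) are made up, and `X`, `W_q`, `W_k`, `W_v` are random stand-ins for the embeddings and for the weight matrices that a real model would learn during training. Each row of `X` is one word, so the projection is written as `X @ W_q`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 4 words, embedding size 8, query/key/value size 3.
seq_len, d_model, d_k = 4, 8, 3

# Stand-in for the word embeddings (plus positional encoding), one row per word.
X = rng.normal(size=(seq_len, d_model))

# Stand-ins for the projection matrices. In a real model these are
# trainable parameters learned by backpropagation, not random values.
W_q = rng.normal(size=(d_model, d_k))
W_k = rng.normal(size=(d_model, d_k))
W_v = rng.normal(size=(d_model, d_k))

# Row i of Q, K, V is the query, key, and value for word i.
Q = X @ W_q
K = X @ W_k
V = X @ W_v

# Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V.
scores = Q @ K.T / np.sqrt(d_k)
scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
weights = np.exp(scores)
weights = weights / weights.sum(axis=-1, keepdims=True)

# Row i of A is the attention-based representation of word i.
A = weights @ V
```

So the only things the model learns here are the three matrices; the q, k, and v vectors are recomputed from the embeddings every time a sentence is passed through the layer.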
