Week 1 Scaled Dot-Product Attention: Ungraded Lab - rows vs. columns

In the solution, it says:
assert weights.sum(axis=1)[0] == 1, "Each row in weights must sum to 1"

However, if I sum the weights over axis 1, aren't I checking that the columns add up to one?

Hi @raumschiff

The assertion ensures that the attention scores for each query vector are properly normalized. In an attention mechanism, the weights matrix has dimensions [num_queries, num_keys], where each element represents the attention score assigned to a key given a query. For each query, the attention scores across all keys should sum to 1. Note that in NumPy, `sum(axis=1)` sums *along* axis 1, i.e., across the columns within each row, so it produces one value per row — the row sums, not the column sums.
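A minimal sketch of this, using made-up scores for 2 queries and 3 keys (the values are illustrative only, not from the lab):

```python
import numpy as np

# Hypothetical raw attention scores (Q K^T / sqrt(d_k)) for
# 2 queries and 3 keys -- values chosen only for illustration.
scores = np.array([[1.0, 2.0, 0.5],
                   [0.3, 0.3, 0.3]])

# Softmax over axis=1 (the keys axis) normalizes each row.
weights = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)

# sum(axis=1) collapses the columns within each row, yielding the
# row sums: one value per query, and each should be 1.
row_sums = weights.sum(axis=1)
print(row_sums)  # -> [1. 1.]
```

So `weights.sum(axis=1)` returns the per-row totals, which is exactly what the assertion checks.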

Hope it helps!

Yes, thanks.
Not sure what I was thinking :o