C5_W4_A1 assignment Exercise 3

Hey all, on Exercise 3 (scaled_dot_product_attention) I keep getting the assertion error “Wrong unmasked weights”.

I’ve checked that the right dimension of k is used to compute dk, that the softmax normalizes along the right axis, and that the mask addition looks fine.
Any other ideas for places to check for errors?

Please click my name and send me your notebook as an attachment in a message.

Hint 1:
When computing matmul_qk, pay attention to the right side of the matrix multiplication. tf.matmul has a boolean flag that can help you with this.

Hint 2:
Here’s the shape of k:

k – key shape == (…, seq_len_k, depth)

You want to consider the key dimension, and not details such as batch size, when calculating dk. Watching the lectures might help you understand this better.
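
For anyone who wants to see how the shapes line up, here is a rough NumPy sketch of the standard scaled dot-product attention formula. It is not the assignment's TensorFlow solution: the function name attention_sketch, the softmax helper, the example shapes, and the mask convention (1 = attend, 0 = blocked) are all assumptions made for illustration, so check them against your notebook's docstring.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)

def attention_sketch(q, k, v, mask=None):
    # Hypothetical helper, for illustration only.
    # q: (..., seq_len_q, depth), k: (..., seq_len_k, depth), v: (..., seq_len_k, depth_v)
    # Transpose only the last two axes of k (the right-hand operand).
    matmul_qk = q @ np.swapaxes(k, -1, -2)   # (..., seq_len_q, seq_len_k)
    # dk comes from the last axis of k (depth), not from the batch size.
    dk = np.float32(k.shape[-1])
    scaled = matmul_qk / np.sqrt(dk)
    if mask is not None:
        # ASSUMED convention: 1 = attend, 0 = blocked.
        scaled = scaled + (1.0 - mask) * -1e9
    # Normalize over the key axis so each query's weights sum to 1.
    weights = softmax(scaled, axis=-1)
    return weights @ v, weights

rng = np.random.default_rng(0)
q = rng.normal(size=(2, 3, 4)).astype(np.float32)
k = rng.normal(size=(2, 5, 4)).astype(np.float32)
v = rng.normal(size=(2, 5, 6)).astype(np.float32)
out, w = attention_sketch(q, k, v)
print(out.shape, w.shape)
```

Printing the intermediate shapes like this in your own code is a quick way to spot a transpose on the wrong operand or a softmax over the wrong axis.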

THANK YOU :smiley:! I fixed the k too.

Hi! I’m getting a really similar result, and I can’t find my error. k.shape[-1] is the depth, which I am casting to float32 and passing to tf.sqrt to get the scaling term.

I can’t find the error with my matmul operation, setting t_b to True. Can I have another hint?

Please click my name and send me your notebook as an attachment in a message.