C5_W4_A1 scaled_dot_product_attention mask issue


I get the following error when running scaled_dot_product_attention_test(scaled_dot_product_attention)

AssertionError Traceback (most recent call last)
----> 2 scaled_dot_product_attention_test(scaled_dot_product_attention)

~/work/W4A1/public_tests.py in scaled_dot_product_attention_test(target)
60 assert np.allclose(weights, [[0.2589478, 0.42693272, 0.15705977, 0.15705977],
61 [0.2772748, 0.2772748, 0.2772748, 0.16817567],
---> 62 [0.33620113, 0.33620113, 0.12368149, 0.2039163 ]])
64 assert tf.is_tensor(attention), "Output must be a tensor"

I am starting with tf.matmul, computing dk from the shape of k, adding (1 - mask) * -1.0e9 to scaled_attention_logits, and finishing with tf.matmul.

I have looked at all the other related threads and I don't spot my error in any of them.

When I print(mask) inside scaled_dot_product_attention just before the "if mask is not None" check, the output is None, so the line where we multiply by -1.0e9 is never even executed. Could that be the error? Is it normal for mask to be None?

Thank you for your help, I’m clueless here.

If you are applying softmax to scaled_attention_logits before the final dot product, please click my name and message your notebook as an attachment.

Here’s a hint:
The 1st dot product is of the form Q . K^{T}
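To illustrate the hint, here is a minimal NumPy sketch of the standard scaled dot-product attention formula. This is not the notebook's TensorFlow solution; the function and argument names mirror the assignment, but the `softmax` helper and the 2-D array shapes are my own simplifications. Note the transpose on K in the first matmul, the mask term that only fires when a mask is passed, and the softmax applied before the final dot product:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - np.max(x, axis=axis, keepdims=True))
    return e / np.sum(e, axis=axis, keepdims=True)

def scaled_dot_product_attention(q, k, v, mask=None):
    # 1st dot product: Q . K^T -- the transpose is on K.
    matmul_qk = q @ k.T
    # dk comes from the last dimension of k.
    dk = k.shape[-1]
    scaled_attention_logits = matmul_qk / np.sqrt(dk)
    # The test may legitimately call this with mask=None,
    # so this branch is skipped on some test cases.
    if mask is not None:
        scaled_attention_logits += (1.0 - mask) * -1.0e9
    # Softmax BEFORE the final dot product with V.
    attention_weights = softmax(scaled_attention_logits, axis=-1)
    output = attention_weights @ v
    return output, attention_weights
```

With a mask of ones and zeros, the masked positions receive a logit near -1e9 and their attention weights collapse to approximately zero after the softmax.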

Thank you so much, I was able to complete the assignment! It was really silly of me not to notice this. I was confused because I didn't know the details of the scaled_dot_product_attention_test function.