C5_W4_A1 scaled_dot_product_attention mask issue

gallinet · November 15, 2022, 1:11pm

Hi,

I get the following error when running scaled_dot_product_attention_test(scaled_dot_product_attention)

AssertionError Traceback (most recent call last)
in
1 # UNIT TEST
----> 2 scaled_dot_product_attention_test(scaled_dot_product_attention)

~/work/W4A1/public_tests.py in scaled_dot_product_attention_test(target)
60 assert np.allclose(weights, [[0.2589478, 0.42693272, 0.15705977, 0.15705977],
61 [0.2772748, 0.2772748, 0.2772748, 0.16817567],
---> 62 [0.33620113, 0.33620113, 0.12368149, 0.2039163 ]])
63
64 assert tf.is_tensor(attention), "Output must be a tensor"

I start with tf.matmul, compute dk from the shape of k, add (1 - mask) * -1e9 to scaled_attention_logits, and finish with tf.matmul.

I have looked at all the other related threads and I can't spot my error from them.

When I call print(mask) inside scaled_dot_product_attention just before “if mask is not None”, the output is None, so the line that multiplies by -1.0e9 is never executed. Could that be the error? Is it normal for mask to be None?

Thank you for your help, I’m clueless here.

balaji.ambresh · November 15, 2022, 1:29pm

If you are applying softmax to scaled_attention_logits before the final dot product, please click my name and message your notebook as an attachment.

balaji.ambresh · November 15, 2022, 1:43pm

Here’s a hint:
The 1st dot product is of the form Q · K^{T} (note the transpose on K).
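
To make the hint concrete, here is a minimal NumPy sketch of the computation order: Q · K^{T}, scale by sqrt(dk), optionally add (1 - mask) * -1e9, softmax over the last axis, then multiply by V. It is not the notebook's TensorFlow code. The function name scaled_dot_product_attention_np, the softmax helper, and the q, k, v values are assumptions chosen for illustration. With these inputs and no mask, the weights come out equal to the ones quoted in the assertion above.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum first for numerical stability.
    shifted = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / np.sum(e, axis=axis, keepdims=True)

def scaled_dot_product_attention_np(q, k, v, mask=None):
    # Hypothetical NumPy stand-in for the assignment's function.
    # First dot product: Q . K^T (K is transposed on its last two axes).
    matmul_qk = q @ np.swapaxes(k, -1, -2)
    # Scale by the square root of the key depth dk.
    dk = k.shape[-1]
    scaled_attention_logits = matmul_qk / np.sqrt(dk)
    # Only applied when a mask is passed; mask=None skips this branch.
    if mask is not None:
        scaled_attention_logits = scaled_attention_logits + (1.0 - mask) * -1e9
    # Softmax over the key axis, then the final dot product with V.
    attention_weights = softmax(scaled_attention_logits, axis=-1)
    output = attention_weights @ v
    return output, attention_weights

# Illustrative inputs (assumed, not taken from this thread).
q = np.array([[1., 0., 1., 1.], [0., 1., 1., 1.], [1., 0., 0., 1.]])
k = np.array([[1., 1., 0., 1.], [1., 0., 1., 1.], [0., 1., 1., 0.], [0., 0., 0., 1.]])
v = np.array([[0., 0.], [1., 0.], [1., 0.], [1., 1.]])

# No mask: the masking line is skipped entirely.
out, weights = scaled_dot_product_attention_np(q, k, v, mask=None)

# With a mask: a 0 in the mask pushes that key's weight to about 0.
mask = np.array([[1., 1., 0., 1.], [1., 1., 0., 1.], [1., 1., 0., 1.]])
masked_out, masked_weights = scaled_dot_product_attention_np(q, k, v, mask)
```

A test that calls the function without a mask never reaches the masking line, so a mismatch in the unmasked weights points at the dot product, the scaling, or the softmax rather than at the mask.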

gallinet · November 15, 2022, 1:52pm

Thank you so much, I was able to complete the assignment! It was really silly of me not to notice this. I was confused because I didn't know the details of the scaled_dot_product_attention_test function.
