DLS Course 5 Week 4: scaled dot product attention

Hi, I am having issues with the final exercise of W4. I am getting the following error, and the grader output does not give much detail.

AssertionError Traceback (most recent call last)
in
1 # UNIT TEST
----> 2 scaled_dot_product_attention_test(scaled_dot_product_attention)

~/work/W4A1/public_tests.py in scaled_dot_product_attention_test(target)
60 assert np.allclose(weights, [[0.2589478, 0.42693272, 0.15705977, 0.15705977],
61 [0.2772748, 0.2772748, 0.2772748, 0.16817567],
—> 62 [0.33620113, 0.33620113, 0.12368149, 0.2039163 ]])
63
64 assert tf.is_tensor(attention), "Output must be a tensor"

AssertionError:

I printed both the output and the attention weights variables, and both are tf.Tensors.

tf.Tensor(
[[0.7464066 0.23822893]
[0.7461551 0.23846523]
[0.7383507 0.24579675]], shape=(3, 2), dtype=float32)
tf.Tensor(
[[0.2535934 0.26994875 0.23822893 0.23822893]
[0.25384492 0.25384492 0.25384492 0.23846523]
[0.26164928 0.26164928 0.23090468 0.24579675]], shape=(3, 4), dtype=float32)

Do you have an idea of what could be wrong?

This is my implementation:

matmul_qk = tf.linalg.matmul(q, k, transpose_b=True) # (…, seq_len_q, seq_len_k)

dk = k.shape[-2]
scaled_attention_logits = tf.divide(matmul_qk,dk**2)

if mask is not None: # Don't replace this None
    scaled_attention_logits += (1.0-mask)*-1e9 

attention_weights = tf.keras.activations.softmax(scaled_attention_logits) 
output = tf.linalg.matmul(attention_weights,v)

Can you post your implementation of scaled attention?

You have a few mistakes. Read the instructions very carefully again. For example, d_k is the embedding dimension of each head, i.e., the last dimension of the key (d_model / h). The scaling is done using the square ROOT of d_k, not d_k squared. 🙂
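To make those two fixes concrete, here is a minimal sketch of the corrected computation. It is not the official notebook solution; it just assumes the signature from your post (q, k, v, mask) and the same mask convention (mask holds 1 for positions to keep), so double-check both against the notebook instructions:

import tensorflow as tf

def scaled_dot_product_attention(q, k, v, mask=None):
    # (..., seq_len_q, seq_len_k)
    matmul_qk = tf.matmul(q, k, transpose_b=True)

    # d_k is the LAST dimension of the key; divide by its square root
    dk = tf.cast(tf.shape(k)[-1], tf.float32)
    scaled_attention_logits = matmul_qk / tf.math.sqrt(dk)

    if mask is not None:
        # push masked-out positions to a large negative value before softmax
        scaled_attention_logits += (1.0 - mask) * -1e9

    # softmax over the key axis, so each query's weights sum to 1
    attention_weights = tf.keras.activations.softmax(scaled_attention_logits, axis=-1)
    output = tf.matmul(attention_weights, v)  # (..., seq_len_q, d_v)
    return output, attention_weights

This also explains the symptom you saw: dividing by d_k squared (with d_k taken from the wrong axis) makes the logits much smaller than intended, so the softmax comes out nearly uniform, which is why your printed weights are all close to 0.25 instead of the expected values.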

Note that the failing assert is the one on line 62, i.e., the attention weights have the wrong values. It is not about the values being tensors; that check is on line 64.