W4 A1 | Transformers: scaled dot-product attention assessment

Hi everybody.

I feel very stupid (and I am sure I have a bug in the notebook).

My printed "output" after the scaled dot-product attention assessment cell:

tf.Tensor(
[[0.2589478 0.42693272 0.15705977 0.15705977]
[0.2772748 0.2772748 0.2772748 0.16817567]
[0.33620113 0.33620113 0.12368149 0.2039163 ]], shape=(3, 4), dtype=float32)

assessment comments:

AssertionError                            Traceback (most recent call last)
<ipython-input-...> in <module>
      1 # UNIT TEST
----> 2 scaled_dot_product_attention_test(scaled_dot_product_attention)

~/work/W4A1/public_tests.py in scaled_dot_product_attention_test(target)
     60     assert np.allclose(weights, [[0.2589478, 0.42693272, 0.15705977, 0.15705977],
     61                                  [0.2772748, 0.2772748, 0.2772748, 0.16817567],
---> 62                                  [0.33620113, 0.33620113, 0.12368149, 0.2039163]])
     63
     64     assert tf.is_tensor(attention), "Output must be a tensor"

AssertionError:


Any hints on how to debug from here would be appreciated.

Edited: I noticed the missing "," in my output. Will try to debug.
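For anyone comparing the same way: the missing commas are just how tf.Tensor prints, so an eyeball comparison can mislead. Here is a sketch of a numerical check, assuming attention_weights is the second value your function returns (the expected values are copied from the traceback above):

import numpy as np

expected = np.array([[0.2589478, 0.42693272, 0.15705977, 0.15705977],
                     [0.2772748, 0.2772748, 0.2772748, 0.16817567],
                     [0.33620113, 0.33620113, 0.12368149, 0.2039163]])

# attention_weights: the tensor your scaled_dot_product_attention returned
print(np.allclose(attention_weights.numpy(), expected))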

Hi everyone,
I have exactly the same error: assert tf.is_tensor(attention), "Output must be a tensor"
I double-checked my output with
print(tf.is_tensor(output), tf.is_tensor(attention_weights))
and it shows that both the output and the attention weights are tensors. However, I'm still getting the error message.
Because of this error, when I submit my code I get 0 points, with another error:
"Cell #13. Can't compile the student's code. Error: AssertionError()"

Can anyone help with that?

Hi, Jan Kieres.

If that's the case, try restarting the kernel and rerunning all cells, then check the exact error and share it with us. Thanks!

Hi, everyone.
I got the same error. I tried restarting the kernel, but it didn't work. Has anyone found a solution?
Can anyone help us with this? :pleading_face:

I miscalculated: I had used dk = k.ndim, but it should be seq_len_k.
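For reference, a minimal sketch of the standard computation from "Attention Is All You Need" (names are illustrative, not the assignment's exact template; the mask convention assumes mask == 1 means "attend", as in the course notebook). Note that in the paper dk is the key depth, i.e. the last axis of k; if the test's k happens to be square, the key depth and seq_len_k give the same scale, so both appear to work here:

import tensorflow as tf

def scaled_dot_product_attention_sketch(q, k, v, mask=None):
    # q: (..., seq_len_q, depth), k: (..., seq_len_k, depth), v: (..., seq_len_k, depth_v)
    matmul_qk = tf.matmul(q, k, transpose_b=True)        # (..., seq_len_q, seq_len_k)
    dk = tf.cast(tf.shape(k)[-1], tf.float32)            # depth of the keys, not k.ndim
    scaled_attention_logits = matmul_qk / tf.math.sqrt(dk)
    if mask is not None:                                 # mask: 1 = attend, 0 = mask out
        scaled_attention_logits += (1. - mask) * -1e9
    attention_weights = tf.nn.softmax(scaled_attention_logits, axis=-1)  # sums to 1 over keys
    output = tf.matmul(attention_weights, v)             # (..., seq_len_q, depth_v)
    return output, attention_weights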

I am getting the same error: assert tf.is_tensor(attention), "Output must be a tensor"

I’m using seq_len_k as you suggested above:
scaled_attention_logits = tf.math.divide(matmul_qk, tf.math.sqrt(seq_len_k*1.0))

I validated that I am outputting tensors:

print(f" output={type(output)}  {output.shape}")
print(f" attention_weights={type(attention_weights)}  {attention_weights.shape}")

# END CODE HERE

return output, attention_weights

output=<class 'tensorflow.python.framework.ops.EagerTensor'> (3, 2)
attention_weights=<class 'tensorflow.python.framework.ops.EagerTensor'> (3, 4)

please help
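One way to narrow this down is to rerun the unit test's own two checks directly on your return values. A sketch: q, k, v stand for whatever test inputs you already have in scope, and it assumes your function returns (output, attention_weights) in that order:

import numpy as np
import tensorflow as tf

output, attention_weights = scaled_dot_product_attention(q, k, v, None)

print(tf.is_tensor(output), tf.is_tensor(attention_weights))  # both should print True
print(np.round(attention_weights.numpy(), 8))
# Compare the printed rows one by one against the expected matrix in the traceback;
# a mismatch in any row points at the scaling, masking, or softmax axis.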

Hello all!

Welcome to the community.

Please go through this link on how to resolve issues related to scaled dot-product attention.