C5_W4_A1 scaled_dot_product_attention

Hi, I am stuck on this exercise. I followed all the previous notes. Here is the end of my printed output:

[[0.3071959 0.5064804 0. 0.18632373]
[0.38365173 0.38365173 0. 0.23269653]
[0.38365173 0.38365173 0. 0.23269653]]], shape=(3, 3, 4), dtype=float32)

and the error message:

AssertionError Traceback (most recent call last)
<ipython-input-...> in <module>
1 # UNIT TEST
----> 2 scaled_dot_product_attention_test(scaled_dot_product_attention)

~/work/W4A1/public_tests.py in scaled_dot_product_attention_test(target)
73 assert np.allclose(weights, [[0.30719590187072754, 0.5064803957939148, 0.0, 0.18632373213768005],
74 [0.3836517333984375, 0.3836517333984375, 0.0, 0.2326965481042862],
---> 75 [0.3836517333984375, 0.3836517333984375, 0.0, 0.2326965481042862]]), "Wrong masked weights"
76 assert np.allclose(attention, [[0.6928040981292725, 0.18632373213768005],
77 [0.6163482666015625, 0.2326965481042862],

AssertionError: Wrong masked weights

My numbers are shorter (I get fewer digits). Does anyone have an idea?
Thanks

Hello @Yona_Hollander,

Printing a Tensor does not give us all the digits on screen, so we can't yet tell whether your numbers are actually shorter. However, I found that your weights have a shape of (3, 3, 4), whereas the correct shape should be (1, 3, 4). This is interesting, because normally the test should complain about the shape first.
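If you want to rule out a precision difference first, convert the tensor to NumPy and raise the print precision. A minimal sketch, using `weights` as a stand-in name for the attention-weights tensor your function returns:

```python
import numpy as np
import tensorflow as tf

# Show more significant digits than TensorFlow's default tensor printout.
np.set_printoptions(precision=16)

# `weights` is a hypothetical stand-in for your returned attention weights.
weights = tf.constant([[0.3071959, 0.5064804, 0.0, 0.18632373]])
print(weights.numpy())
```

If the full-precision values then match the expected ones, the remaining problem is the shape rather than the numbers.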


I suggest you first figure out why the shape is (3, 3, 4) instead of (1, 3, 4). If you have no idea where the problem is, add some prints to show the shapes of each of q, k, v, and mask, then use the following formula to work out the expected shapes of all intermediate variables in your solution, and compare those expected shapes with the shapes your code actually computes (there is a sketch of this after the formula).

Attention(Q, K, V) = softmax(Q Kᵀ / sqrt(d_k) + M) V, where M is the mask term (added only when a mask is passed in).
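As a concrete starting point, here is a minimal shape-debugging sketch. It assumes TensorFlow, the assignment's argument order (q, k, v, mask), and a mask that is 1 for valid positions; the body is just the formula above written out directly with shape prints at each step, not necessarily how your own solution is structured:

```python
import tensorflow as tf

def scaled_dot_product_attention(q, k, v, mask):
    # Print input shapes so you can compare them with the formula's expectations.
    print("q:", q.shape, "k:", k.shape, "v:", v.shape,
          "mask:", None if mask is None else mask.shape)

    matmul_qk = tf.matmul(q, k, transpose_b=True)       # (..., seq_len_q, seq_len_k)
    print("matmul_qk:", matmul_qk.shape)

    dk = tf.cast(tf.shape(k)[-1], tf.float32)
    scaled_logits = matmul_qk / tf.math.sqrt(dk)

    if mask is not None:
        # The mask must broadcast against (..., seq_len_q, seq_len_k); a wrong
        # transpose or multiplication here can silently change the batch
        # dimension, e.g. from (1, 3, 4) to (3, 3, 4).
        mask = tf.cast(mask, tf.float32)
        scaled_logits += (1.0 - mask) * -1e9

    attention_weights = tf.nn.softmax(scaled_logits, axis=-1)
    print("attention_weights:", attention_weights.shape)  # test expects (1, 3, 4)

    output = tf.matmul(attention_weights, v)            # (..., seq_len_q, depth_v)
    print("output:", output.shape)
    return output, attention_weights
```

Running the test inputs through prints like these and comparing each printed shape with what the formula predicts usually pinpoints the exact line where the extra batch dimension (3 instead of 1) first appears; the mask term is a common culprit.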

Good luck,
Raymond