C5W4 exercise 6 DecoderLayer call method

Hello. I have managed to implement the calculation of the attention output (mult_attn_out2) of the second MHA block successfully (at least it passes the unit tests), but according to the final unit test the values in the output of the feed-forward network are wrong. So something is going wrong in the add & norm layer of the second MHA block, the FFN layer, the dropout layer, the final add & norm layer, or some combination of these, but I cannot find the problem. My code is attached below; hopefully someone can see what I cannot.

DecoderLayer call method code

mentor edit: code removed

Error Message


AssertionError                            Traceback (most recent call last)
<ipython-input> in <module>
      1 # UNIT TEST
----> 2 DecoderLayer_test(DecoderLayer, create_look_ahead_mask)

~/work/W4A1/public_tests.py in DecoderLayer_test(target, create_look_ahead_mask)
    180     assert np.allclose(attn_w_b1[0, 0, 1], [0.5271505, 0.47284946, 0.], atol=1e-2), "Wrong values in attn_w_b1. Check the call to self.mha1"
    181     assert np.allclose(attn_w_b2[0, 0, 1], [0.32048798, 0.390301, 0.28921106]), "Wrong values in attn_w_b2. Check the call to self.mha2"
--> 182     assert np.allclose(out[0, 0], [-0.22109576, -1.5455486, 0.852692, 0.9139523]), "Wrong values in out"
    183
    184

AssertionError: Wrong values in out

The "output of the first block" is Q1 (the result of the first add & norm), not mult_attn_out1, so Q1 is what the second attention block and its skip connection should use.
Also, the only layer that needs "training=training" is the dropout_ffn() layer.
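
For reference, here is a small standalone sketch of just the cross-attention and FFN path with both of those fixes applied. This is not the assignment's DecoderLayer (the class name TinyDecoderBlock and the reduced constructor are made up for illustration), but it uses the same names for the relevant layers (mha2, layernorm2, ffn, dropout_ffn, layernorm3):

import tensorflow as tf

class TinyDecoderBlock(tf.keras.layers.Layer):
    """Illustrative cross-attention + FFN path only, not the full assignment class."""
    def __init__(self, d_model=4, num_heads=2, dff=8, rate=0.1):
        super().__init__()
        self.mha2 = tf.keras.layers.MultiHeadAttention(num_heads=num_heads, key_dim=d_model)
        self.ffn = tf.keras.Sequential([
            tf.keras.layers.Dense(dff, activation='relu'),
            tf.keras.layers.Dense(d_model)])
        self.layernorm2 = tf.keras.layers.LayerNormalization(epsilon=1e-6)
        self.layernorm3 = tf.keras.layers.LayerNormalization(epsilon=1e-6)
        self.dropout_ffn = tf.keras.layers.Dropout(rate)

    def call(self, Q1, enc_output, padding_mask=None, training=False):
        # Cross-attention: Q1 (the output of block 1's add & norm) is the query;
        # the encoder output is value and key
        mult_attn_out2, attn_weights_block2 = self.mha2(
            Q1, enc_output, enc_output, padding_mask, return_attention_scores=True)
        # Add & norm: the skip connection also uses Q1, not mult_attn_out1
        Q2 = self.layernorm2(mult_attn_out2 + Q1)
        # FFN, then dropout (the only layer given training=training), then the final add & norm
        ffn_output = self.ffn(Q2)
        ffn_output = self.dropout_ffn(ffn_output, training=training)
        return self.layernorm3(ffn_output + Q2), attn_weights_block2

# Quick smoke test with random tensors
x = tf.random.uniform((1, 3, 4))
block = TinyDecoderBlock()
out, attn_w = block(x, x, training=True)
print(out.shape, attn_w.shape)  # (1, 3, 4) (1, 2, 3, 3)
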


That sorted it, thanks. I knew it was something stupidly simple.