C5_W4_A1_UNQ_C6 Decoder Layer

Hello, I'm wondering if I am missing some formatting or syntax subtlety in the DecoderLayer call to self.mha2:

        # calculate self-attention using the Q from the first block and K and V from the encoder output. 
        # Dropout will be applied during training
        # Return attention scores as attn_weights_block2 (~1 line) 
        mult_attn_out2, attn_weights_block2 = self.mha2(Q1, enc_output, enc_output, padding_mask, return_attention_scores=True)  # (batch_size, target_seq_len, d_model)

    AssertionError                            Traceback (most recent call last)
    <ipython-input-20-7c54c129d8b0> in <module>
          1 # UNIT TEST
    ----> 2 DecoderLayer_test(DecoderLayer, create_look_ahead_mask)

    ~/work/W4A1/public_tests.py in DecoderLayer_test(target, create_look_ahead_mask)
        180     assert np.allclose(attn_w_b1[0, 0, 1], [0.5271505,  0.47284946, 0.], atol=1e-2), "Wrong values in attn_w_b1. Check the call to self.mha1"
        181     assert np.allclose(attn_w_b2[0, 0, 1], [0.32048798, 0.390301, 0.28921106]),  "Wrong values in attn_w_b2. Check the call to self.mha2"
    --> 182     assert np.allclose(out[0, 0], [-0.22109576, -1.5455486, 0.852692, 0.9139523]), "Wrong values in out"

    AssertionError: Wrong values in out

Has this issue been resolved?

Not yet, I am looking for tips on the self.mha2 call in the Decoder exercise.
Thanks for any help with the error included in the post.

When I tested your notebook earlier, after fixing the problem in the self.mha() constructor, I did have to fix a typo in DecoderLayer(), at self.mha2(…). The “look_ahead_mask” in that line was missing the last ‘k’.

In positional_encoding(), for “angle_rads = …”, you’re supposed to use your get_angles() function.
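
For reference, here is a minimal sketch of that pattern. It assumes the common (pos, i, d) signature for get_angles and the standard sinusoidal formula pos / 10000^(2(i//2)/d); the assignment's actual signature may differ:

    import numpy as np

    def get_angles(pos, i, d):
        # Standard sinusoidal angle rates: pos / 10000^(2*(i//2)/d)
        # (signature is an assumption; match your own get_angles)
        return pos / np.power(10000, (2 * (i // 2)) / np.float32(d))

    def positional_encoding(positions, d):
        # Build the angle matrix with get_angles rather than inlining the formula
        angle_rads = get_angles(np.arange(positions)[:, np.newaxis],
                                np.arange(d)[np.newaxis, :],
                                d)
        angle_rads[:, 0::2] = np.sin(angle_rads[:, 0::2])  # sine on even indices 2i
        angle_rads[:, 1::2] = np.cos(angle_rads[:, 1::2])  # cosine on odd indices 2i+1
        return angle_rads[np.newaxis, ...]  # shape (1, positions, d)

    print(positional_encoding(4, 8).shape)  # (1, 4, 8)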

In DecoderLayer(), you’re calling self.mha2(…) with Q1, x, x, and look_ahead_mask. That’s incorrect. The instructions say this:
...using the Q from the first block and K and V from the encoder output.

So that should be Q1, enc_output, enc_output, and use padding_mask instead of look_ahead_mask.
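
Here is that call exercised standalone with a tf.keras.layers.MultiHeadAttention layer, which is what self.mha2 is in the assignment template. The shapes and mask below are made up for the demo; in the notebook the call sits inside DecoderLayer.call():

    import tensorflow as tf

    # Hypothetical standalone setup; Q1 normally comes from block 1 and
    # enc_output from the encoder.
    mha2 = tf.keras.layers.MultiHeadAttention(num_heads=2, key_dim=4)
    Q1 = tf.random.uniform((1, 5, 8))          # (batch, target_seq_len, d_model)
    enc_output = tf.random.uniform((1, 7, 8))  # (batch, input_seq_len, d_model)
    padding_mask = tf.ones((1, 1, 7))          # 1 = attend, 0 = masked out

    # Q from the first block; K and V from the encoder output; padding mask
    mult_attn_out2, attn_weights_block2 = mha2(
        Q1, enc_output, enc_output,
        attention_mask=padding_mask,
        return_attention_scores=True)
    print(mult_attn_out2.shape)       # (1, 5, 8)
    print(attn_weights_block2.shape)  # (1, 2, 5, 7)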

In self.layernorm2(…), you should use mult_attn_out2 and Q1.

The instructions for self.layernorm3(…) say:
...to the sum of the ffn output and the output of the second block
But you did not include adding the output of the second block.
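
Taken together, these two tips mean the tail of DecoderLayer.call() should follow the pattern below. This is a standalone sketch with a Dense layer standing in for the assignment's feed-forward network, and out2/out3 are my names for the block outputs:

    import tensorflow as tf

    layernorm2 = tf.keras.layers.LayerNormalization(epsilon=1e-6)
    layernorm3 = tf.keras.layers.LayerNormalization(epsilon=1e-6)
    ffn = tf.keras.layers.Dense(8)  # stand-in for the assignment's ffn block

    Q1 = tf.random.uniform((1, 5, 8))              # output of block 1
    mult_attn_out2 = tf.random.uniform((1, 5, 8))  # output of self.mha2

    out2 = layernorm2(mult_attn_out2 + Q1)  # block 2: attention output + Q1
    ffn_output = ffn(out2)
    out3 = layernorm3(ffn_output + out2)    # block 3: ffn output + block-2 output
    print(out3.shape)  # (1, 5, 8)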


Thanks for your last two tips (the others I had already taken care of). I was able to complete the assignment. Thanks again for your patience, Tom Mosher.