C5_W4_A1_UNQ_C6 Decoder Layer

Hello, I'm wondering if I am missing some format or syntax subtlety in the DecoderLayer call to self.mha2.

# BLOCK 2
        # calculate self-attention using the Q from the first block and K and V from the encoder output. 
        # Dropout will be applied during training
        # Return attention scores as attn_weights_block2 (~1 line) 
        mult_attn_out2, attn_weights_block2 = self.mha2(Q1, enc_output, enc_output, padding_mask, return_attention_scores=True)  # (batch_size, target_seq_len, d_model)

---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-20-7c54c129d8b0> in <module>
      1 # UNIT TEST
----> 2 DecoderLayer_test(DecoderLayer, create_look_ahead_mask)

~/work/W4A1/public_tests.py in DecoderLayer_test(target, create_look_ahead_mask)
    180     assert np.allclose(attn_w_b1[0, 0, 1], [0.5271505,  0.47284946, 0.], atol=1e-2), "Wrong values in attn_w_b1. Check the call to self.mha1"
    181     assert np.allclose(attn_w_b2[0, 0, 1], [0.32048798, 0.390301, 0.28921106]),  "Wrong values in attn_w_b2. Check the call to self.mha2"
--> 182     assert np.allclose(out[0, 0], [-0.22109576, -1.5455486, 0.852692, 0.9139523]), "Wrong values in out"
    183 
    184 

AssertionError: Wrong values in out

Has this issue been resolved?

Not yet, I am still looking for tips on the self.mha2 call in the DecoderLayer exercise …
Thanks for any help with the error included in the post.

When I tested your notebook earlier, after fixing the problem in the self.mha() constructor, I did have to fix a typo in DecoderLayer(), at self.mha2(…). The “look_ahead_mask” in that line was missing the last ‘k’.

In positional_encoding(), for “angle_rads = …”, you’re supposed to use your get_angles() function.
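For reference, that line should look roughly like this. This is only a sketch of the inside of positional_encoding(); it assumes numpy is imported as np and that your get_angles(pos, k, d) helper takes a column of positions and a row of dimension indices, as in the notebook:

    # Sketch only: assumes numpy as np and the notebook's get_angles(pos, k, d) helper.
    # A column of positions and a row of dimension indices broadcast
    # to a (positions, d_model) matrix of angles.
    angle_rads = get_angles(np.arange(positions)[:, np.newaxis],
                            np.arange(d_model)[np.newaxis, :],
                            d_model)
    # then apply sin to the even indices and cos to the odd indices
    angle_rads[:, 0::2] = np.sin(angle_rads[:, 0::2])
    angle_rads[:, 1::2] = np.cos(angle_rads[:, 1::2])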

In DecoderLayer(), you’re calling self.mha2(…) with Q1, x, x, and look_ahead_mask. That’s incorrect. The instructions say this:
...using the Q from the first block and K and V from the encoder output.

So that should be Q1, enc_output, enc_output, and use padding_mask instead of look_ahead_mask.

In self.layernorm2(…), you should use the sum of mult_attn_out2 and Q1.
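Putting those two points together, the second block ends up looking roughly like this. This is a sketch inside DecoderLayer.call(); the name Q2 for the normalized output is just illustrative and may differ in your copy of the template:

    # BLOCK 2: attention over the encoder output.
    # Q comes from the first block (Q1); K and V come from enc_output,
    # and the padding mask (not the look-ahead mask) is used here.
    mult_attn_out2, attn_weights_block2 = self.mha2(
        Q1, enc_output, enc_output, padding_mask, return_attention_scores=True)
    # residual connection: add Q1 to the attention output, then layer-normalize
    Q2 = self.layernorm2(mult_attn_out2 + Q1)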

The instructions for self.layernorm3(…) say:
...to the sum of the ffn output and the output of the second block
But your code did not add the output of the second block.
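So the third block should be roughly as follows. Again this is only a sketch; the layer name self.dropout_ffn and the variable names Q2 / out3 follow the usual notebook template and may differ in your copy:

    # BLOCK 3: feed-forward network applied to the block-2 output
    ffn_output = self.ffn(Q2)
    # dropout on the ffn output (only active while training)
    ffn_output = self.dropout_ffn(ffn_output, training=training)
    # residual connection: add the block-2 output, then layer-normalize
    out3 = self.layernorm3(ffn_output + Q2)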


Thanks for your last two tips (the others I had already taken care of). I was able to complete the assignment. Thanks again for your patience, Tom Mosher.