C5_W4_A1_UNQ_C6 Decoder Layer

Hello, I'm wondering if I am missing some format or syntax subtlety in the DecoderLayer call to self.mha2.

# BLOCK 2
        # calculate self-attention using the Q from the first block and K and V from the encoder output. 
        # Dropout will be applied during training
        # Return attention scores as attn_weights_block2 (~1 line) 
        mult_attn_out2, attn_weights_block2 = self.mha2(Q1, enc_output, enc_output, padding_mask, return_attention_scores=True)  # (batch_size, target_seq_len, d_model)

---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-20-7c54c129d8b0> in <module>
      1 # UNIT TEST
----> 2 DecoderLayer_test(DecoderLayer, create_look_ahead_mask)

~/work/W4A1/public_tests.py in DecoderLayer_test(target, create_look_ahead_mask)
    180     assert np.allclose(attn_w_b1[0, 0, 1], [0.5271505,  0.47284946, 0.], atol=1e-2), "Wrong values in attn_w_b1. Check the call to self.mha1"
    181     assert np.allclose(attn_w_b2[0, 0, 1], [0.32048798, 0.390301, 0.28921106]),  "Wrong values in attn_w_b2. Check the call to self.mha2"
--> 182     assert np.allclose(out[0, 0], [-0.22109576, -1.5455486, 0.852692, 0.9139523]), "Wrong values in out"
    183 
    184 

AssertionError: Wrong values in out

Has this issue been resolved?

Not yet, I am still looking for tips on the self.mha2 call in the DecoderLayer exercise …
Thanks for any help with the error included in the post.

When I tested your notebook earlier, after fixing the problem in the self.mha() constructor, I did have to fix a typo in DecoderLayer(), at self.mha2(…). The “look_ahead_mask” in that line was missing the last ‘k’.

In positional_encoding(), for “angle_rads = …”, you’re supposed to use your get_angles() function.
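For reference, that line should look roughly like this. This is only a sketch of the inside of positional_encoding(); it assumes numpy is imported as np and that your get_angles(pos, k, d) helper takes a column of positions and a row of dimension indices, as in the notebook:

    # Sketch only: assumes numpy as np and the notebook's get_angles(pos, k, d) helper.
    # A column of positions and a row of dimension indices broadcast
    # to a (positions, d_model) matrix of angles.
    angle_rads = get_angles(np.arange(positions)[:, np.newaxis],
                            np.arange(d_model)[np.newaxis, :],
                            d_model)
    # then apply sin to the even indices and cos to the odd indices
    angle_rads[:, 0::2] = np.sin(angle_rads[:, 0::2])
    angle_rads[:, 1::2] = np.cos(angle_rads[:, 1::2])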

In DecoderLayer(), you’re calling self.mha2(…) with Q1, x, x, and look_ahead_mask. That’s incorrect. The instructions say this:
...using the Q from the first block and K and V from the encoder output.

So that should be Q1, enc_output, enc_output, and use padding_mask instead of look_ahead_mask.

In self.layernorm2(…), you should use the sum of mult_attn_out2 and Q1.
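Putting those two points together, the second block ends up looking roughly like this. This is a sketch inside DecoderLayer.call(); the name Q2 for the normalized output is just illustrative and may differ in your copy of the template:

    # BLOCK 2: attention over the encoder output.
    # Q comes from the first block (Q1); K and V come from enc_output,
    # and the padding mask (not the look-ahead mask) is used here.
    mult_attn_out2, attn_weights_block2 = self.mha2(
        Q1, enc_output, enc_output, padding_mask, return_attention_scores=True)
    # residual connection: add Q1 to the attention output, then layer-normalize
    Q2 = self.layernorm2(mult_attn_out2 + Q1)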

The instructions for self.layernorm3(…) say:
...to the sum of the ffn output and the output of the second block
But your code did not add the output of the second block.
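So the third block should be roughly as follows. Again this is only a sketch; the layer name self.dropout_ffn and the variable names Q2 / out3 follow the usual notebook template and may differ in your copy:

    # BLOCK 3: feed-forward network applied to the block-2 output
    ffn_output = self.ffn(Q2)
    # dropout on the ffn output (only active while training)
    ffn_output = self.dropout_ffn(ffn_output, training=training)
    # residual connection: add the block-2 output, then layer-normalize
    out3 = self.layernorm3(ffn_output + Q2)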


Thanks for your last two tips (the others I had already taken care of). I was able to complete the assignment. Thanks again for your patience, Tom Mosher.