I came across an issue with the DecoderLayer function. I believe I have everything correct; however, the unit test is failing with the error:
AssertionError: Wrong values in attn_w_b2. Check the call to self.mha2
However, when I comment out that assert in the unit test section, everything passes, and the grader also gives me 100% for the exercise. I'm a little confused about whether I actually have it correct.
I have the following code for the 2nd attention: attn2, attn_weights_block2 = self.mha2(enc_output, enc_output, out1, padding_mask, return_attention_scores=True)
Hi
I am also getting the same error:
Cell #20. Can't compile the student's code. Error: AssertionError('Wrong values in attn_w_b2. Check the call to self.mha2')
But I don't see any issue with my code; the error seems to be in the test code.
Could a mentor please check and respond?
Thanks
Anand
I am having the same problem as the OP. If I comment out the DecoderLayer assert, then the Decoder tests pass. If I uncomment it and modify my DecoderLayer to remove the error, then the Decoder tests fail. There appears to be a discrepancy between the mha2 call in this assignment and the corresponding call on which it is clearly based, at this site:
I am not sure which is correct, because based on the comments in the exercise and the documentation for the MultiHeadAttention layer, the exercise should be correct. However, if I change my code to match the tensorflow.org page, it flows through, except for your assert error.
EDIT: Like the OP, if I comment out the offending asserts in the DecoderLayer_test block, I can send the entire assignment to the grader and receive 100/100.
I think the issue is that the order of the parameters passed to the mha call differs between the assignment (which uses TensorFlow's built-in MultiHeadAttention class) and the "Transformer model for language understanding" webpage referenced here (which implements its own version of the same class).
Nonetheless, even after fixing that error, I still run into the dreaded 'outd' assert issue at the Decoder stage.
EDIT: I tried to use the “incorrect” parameter order and comment out the unit test as suggested above, but I ran into the “Cell 20: can’t compile the student’s code” error, which failed the autograder.
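To see that the argument order really matters with the built-in layer, here is a minimal standalone sketch (not the assignment's code; the shapes and layer sizes are made up purely for illustration). With tf.keras.layers.MultiHeadAttention the positional arguments are query, value, key, so passing enc_output first makes it the query:

import numpy as np
import tensorflow as tf

# Illustrative shapes only: batch=1, seq_len=3, d_model=4
out1 = tf.random.normal((1, 3, 4))        # decoder block-1 output (should be the query)
enc_output = tf.random.normal((1, 3, 4))  # encoder output (should be value and key)

mha2 = tf.keras.layers.MultiHeadAttention(num_heads=2, key_dim=2)

# Order per the Keras docs and the lecture: query, value, key
_, w_lecture = mha2(query=out1, value=enc_output, key=enc_output,
                    return_attention_scores=True)

# Same layer called positionally as (enc_output, enc_output, out1, ...):
# now enc_output is the query and out1 is the key, so the weights differ
_, w_swapped = mha2(enc_output, enc_output, out1,
                    return_attention_scores=True)

print(np.allclose(w_lecture.numpy(), w_swapped.numpy()))  # False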
I commented out the following two assertion statements, after which I was also able to get 100 on the assignment:
#assert np.allclose(attn_w_b2[0, 0, 1], [0.34485385, 0.33230072, 0.32284543]), "Wrong values in attn_w_b2. Check the call to self.mha2"
#assert np.allclose(out[0, 0], [0.64775777, -1.5134472, 1.1092964, -0.24360693]), "Wrong values in out"
So, yes, either there is something wrong with the assertion statements and it is correct to follow the TensorFlow guide verbatim, or we are bypassing an error that should be flagged and the self.mha2 call really is different from the guide.
In TF's documentation for the MultiHeadAttention layer, the tensors passed to the layer are: query, value, key, attention_mask. (Note the order of the parameters in the call signature: "query" comes before "value" and "key".)
I passed the mha2 parameters as "query=out1, value=enc_output, key=enc_output" so that the correct order according to the lecture is ensured, and got the same issues with the full Decoder unit test.
Only passing the mha2 parameters as "self.mha2(enc_output, enc_output, out1, …" helps with that issue (after commenting out the two lines in DecoderLayer_test; the grader doesn't seem to work if the tests fail).
However, in the Decoder_test the line
assert np.allclose(outd[1, 1], [-0.34560338, -0.8762897, -0.4767484, 1.6986415]), "Wrong values in outd"
now passes (and the grader gives 100/100).
=> Consequently, that line in Decoder_test is based on passing the args in the wrong order (or at least inconsistently with the lecture).
I think those test values should be updated by the mentors.
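For reference, here is a small self-contained example of the documented keyword order with tf.keras.layers.MultiHeadAttention and what the returned tensors look like (the shapes are made up purely for illustration, not taken from the assignment):

import tensorflow as tf

# Illustrative shapes only: batch=1, target_len=3, input_len=4, d_model=8
out1 = tf.random.normal((1, 3, 8))        # query: output of the decoder's first block
enc_output = tf.random.normal((1, 4, 8))  # value and key: encoder output
padding_mask = tf.ones((1, 3, 4))         # (batch, target_len, input_len); 1 means "attend" in Keras

mha2 = tf.keras.layers.MultiHeadAttention(num_heads=2, key_dim=4)

attn2, attn_weights_block2 = mha2(
    query=out1,
    value=enc_output,
    key=enc_output,
    attention_mask=padding_mask,
    return_attention_scores=True)

print(attn2.shape)                # (1, 3, 8): (batch, target_len, d_model)
print(attn_weights_block2.shape)  # (1, 2, 3, 4): (batch, num_heads, target_len, input_len)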
Thanks to all for uncovering the parameter issue. Was a rather tedious bug!
Hi,
I am also stuck on Exercise 6, and even though I swapped the positions to out1, enc_output, enc_output in self.mha2, I still have the same problem with the unit test for DecoderLayer.
The deadline is on Monday and I do not know what to do.
# BLOCK 1
# calculate self-attention and return attention scores as attn_weights_block1 (~1 line)
attn1, attn_weights_block1 = self.mha1(x, x, x, look_ahead_mask, return_attention_scores=True)
# BLOCK 2
# calculate attention (encoder-decoder attention) using the Q from the first block and K and V from the encoder output.
# Return attention scores as attn_weights_block2 (~1 line)
attn2, attn_weights_block2 = self.mha2(out1, enc_output, enc_output, padding_mask, return_attention_scores=True)
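If the positional order is easy to mix up, the same two calls can also be written with explicit keyword arguments (a sketch assuming the layers are tf.keras.layers.MultiHeadAttention, as in the assignment), which makes the intended roles unambiguous:

# BLOCK 1: self-attention over the decoder input
attn1, attn_weights_block1 = self.mha1(
    query=x, value=x, key=x,
    attention_mask=look_ahead_mask,
    return_attention_scores=True)
# BLOCK 2: the query comes from block 1, the key and value from the encoder output
attn2, attn_weights_block2 = self.mha2(
    query=out1, value=enc_output, key=enc_output,
    attention_mask=padding_mask,
    return_attention_scores=True)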
I have unlisted this thread because its contents are 2.5 years old and have become obsolete (it recommended modifying the public_utils.py file, which is not a good idea).