Hi, I’m having problems completing the Week 2 assignment.
Exercises 1 (attention mechanism) and 2 (decoder layer) both pass all the tests, but I cannot get Exercise 3 (decoder) to pass:
Failed test case: Wrong values in x.
Expected: [1.6461557, -0.7657816, -0.04255769, -0.8378165]
Got: [ 1.5914114 -0.4431648 -0.0210509 -1.1271956]
Failed test case: Wrong values in outd when training=True.
Expected: [1.6286429, -0.7686589, 0.00983591, -0.86982]
Got: [ 1.5467136 -1.175141 0.09500299 -0.46657562]
Failed test case: Wrong values in outd when training=True and use padding mask.
Expected: [1.390952, 0.2794097, -0.2910638, -1.3792979]
Got: [ 1.5235926 0.24524598 -0.7199153 -1.0489233 ]
If anyone would be so kind as to have a look at my solution, I’ll happily send it privately.
Thanks!
Hi @MatejPolak
Try to find the problem in your code by yourself (e.g. debugging, printing intermediate values, etc.). However, if you feel stuck, I can take a look at your code!
Hi @MatejPolak
Your output shows that the decoder test fails because x has incorrect values, and the values are also wrong with training=True and when the padding mask is used.
The debugging steps would be:
First, check that you call the layers the same way you did in the attention mechanism exercise. Also check the part of the instructions that mentions the padding mask.
Next, if you have already done that, go back to the previous graded cell and check that you called each layer according to the instructions given.
If you still cannot find the problem, kindly send a screenshot of your code to one of the mentors who is willing to check it.
Your output gives details about where you might be going wrong, so re-read the instructions carefully. Use the search tool here to find threads from learners who were stuck on the same graded cells. I am sure you will be able to debug it!
Regards
DP
Hi,
thanks to @Alireza_Saei's kind review of my code, I found that the problem was that I was not passing the “training” argument through correctly in Exercise 2 (!). I used this:
mult_attn_out1, attn_weights_block1 = self.mha1(..., training=True, ...)
instead of this:
mult_attn_out1, attn_weights_block1 = self.mha1(..., training=training, ...)
The exercise 2 tests did not catch this error.
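For anyone hitting the same thing, here is a minimal sketch of why the hard-coded flag matters. It is not the assignment code: ToyDropout, buggy_block and fixed_block are hypothetical names made up for illustration, and the toy layer only imitates dropout-like behaviour in plain Python. When the caller asks for training=True, both versions agree, which is one way a test could miss the bug. They only diverge when the caller passes training=False.

```python
import random


class ToyDropout:
    """Hypothetical stand-in for a layer that behaves differently in training."""

    def __init__(self, rate, seed=0):
        self.rate = rate
        self.rng = random.Random(seed)

    def __call__(self, values, training=False):
        if not training:
            # Inference mode: values pass through unchanged.
            return list(values)
        keep = 1.0 - self.rate
        # Training mode: drop each value with probability `rate`,
        # and rescale the surviving values by 1 / keep.
        return [v / keep if self.rng.random() < keep else 0.0 for v in values]


def buggy_block(layer, values, training=False):
    # Bug: the caller's flag is ignored, the layer always runs in training mode.
    return layer(values, training=True)


def fixed_block(layer, values, training=False):
    # Fix: forward the caller's flag to the inner layer.
    return layer(values, training=training)


x = [1.0, 2.0, 3.0, 4.0]

# Caller requests training mode: both versions behave identically (same seed).
same_when_training = (
    buggy_block(ToyDropout(0.5), x, training=True)
    == fixed_block(ToyDropout(0.5), x, training=True)
)

# Caller requests inference mode: only the fixed version respects it.
fixed_out = fixed_block(ToyDropout(0.5), x, training=False)
buggy_out = buggy_block(ToyDropout(0.5), x, training=False)

print(same_when_training)   # the two versions agree in training mode
print(fixed_out == x)       # fixed version leaves the input untouched
print(buggy_out == x)       # buggy version still applies dropout
```

In the real decoder the outer layer calls the inner one with its own training value, so a hard-coded True inside Exercise 2 would plausibly only show up once Exercise 3 calls that layer with a different setting.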
Thanks for the help!
Cheers,
Matej