C4_W2 - Assignment 2: Transformer Summarizer - Exercise 2: DecoderLayer

I got all the shapes right (all the shape tests passed), but a test on the values failed:

I used the padding mask in the second MHA layer, so I don't know what I did wrong. My code is like this:

Could someone give me a hint?

Hello @Jiayin_Guo!

Your first equation is OK; the issue is in the second equation.
In the first equation you correctly use the normalized output of the first block.
Why use its pre-normalization form in the second? (I assume the different phrasing of the comment instructions for the two equations may have played a role…)
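
To show why this kind of slip passes the shape tests but fails the value tests, here is a toy sketch in plain Python. It is not the assignment code: `toy_sublayer` is a hypothetical stand-in for an attention block (the real second block also attends over the encoder output and uses the padding mask), and all names and numbers are made up for illustration.

```python
import math

def layer_norm(x, eps=1e-6):
    """Normalize a vector to zero mean and (approximately) unit variance."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def add(a, b):
    """Elementwise sum, used for the residual connection."""
    return [p + q for p, q in zip(a, b)]

def toy_sublayer(x):
    """Hypothetical stand-in for an attention block (just squares each value)."""
    return [v * v for v in x]

x = [1.0, 2.0, 4.0, 8.0]

# Block 1: sublayer, residual add, then normalization.
pre_norm1 = add(toy_sublayer(x), x)   # output of block 1 BEFORE normalization
out1 = layer_norm(pre_norm1)          # normalized output of block 1

# Block 2: the sublayer is fed the normalized output in both variants.
sub2 = toy_sublayer(out1)

# Consistent wiring: the residual also uses the normalized output.
out2_consistent = layer_norm(add(sub2, out1))

# Inconsistent wiring: the residual uses the pre-normalization tensor.
out2_inconsistent = layer_norm(add(sub2, pre_norm1))

max_diff = max(abs(a - b) for a, b in zip(out2_consistent, out2_inconsistent))
print("same length:", len(out2_consistent) == len(out2_inconsistent))
print("max abs difference:", max_diff)
```

Both variants produce outputs of the same length, so a shape-only check cannot tell them apart; only a comparison of the values reveals the difference.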


Please remove the solutions from the post :slight_smile:



Thanks a lot! @Anna_Kay

Solved the errors. I’ve removed the code lines.
