I got all the shapes right (all the shape tests passed), but the test on values failed:
I used the padding mask in the second MHA layer, so I don't know what I did wrong. My code looks like this:
Could someone give me a hint?
Hello @Jiayin_Guo!
Your first equation is OK; the issue is in the second equation.
In the first equation you correctly use the normalized output of the first block.
Why use its pre-normalization form in the second? (I assume the different phrasing of the instructions in the comments for the two equations may have played a role…)
Please remove the solutions from the post.
Best
Thanks a lot! @Anna_Kay
Solved the errors. I’ve removed the code lines.
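For anyone landing here later: the point in the reply above can be illustrated with a toy sketch. This is not the assignment code. `toy_sublayer`, `two_blocks`, and the `use_normalized` flag are hypothetical names made up for illustration, the sublayers are stand-ins for attention, and the layer norm has no learned scale or shift. It only shows why the shapes can all match while the values differ when the second block is fed the pre-normalization tensor.

```python
import math


def layer_norm(x, eps=1e-6):
    """Normalize a vector to zero mean and unit variance (no learned scale/shift)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def add(a, b):
    """Element-wise sum, used here for the residual connection."""
    return [u + v for u, v in zip(a, b)]


def toy_sublayer(x, scale):
    """Hypothetical nonlinear stand-in for an attention sublayer."""
    return [math.tanh(scale * v) for v in x]


def two_blocks(x, use_normalized=True):
    # Block 1: sublayer, residual connection, then layer normalization.
    pre_norm1 = add(x, toy_sublayer(x, 0.5))
    out1 = layer_norm(pre_norm1)

    # Block 2 should take the normalized out1 as both input and residual.
    # Passing pre_norm1 instead mimics the mistake discussed in this thread.
    block2_in = out1 if use_normalized else pre_norm1
    out2 = layer_norm(add(block2_in, toy_sublayer(block2_in, 2.0)))
    return out1, out2


x = [1.0, 2.0, 4.0]
_, out2_ok = two_blocks(x, use_normalized=True)
_, out2_bad = two_blocks(x, use_normalized=False)
print("same length:", len(out2_ok) == len(out2_bad))
print("max abs difference:", max(abs(a - b) for a, b in zip(out2_ok, out2_bad)))
```

Both variants return vectors of the same length, so a shape test cannot tell them apart; only a comparison of values exposes the difference.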