Hi
I have a problem with exercise 6, DecoderLayer_test:
```
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input> in <module>
     46 print("\033[92mAll tests passed")
     47
---> 48 DecoderLayer_test(DecoderLayer)

<ipython-input> in DecoderLayer_test(target)
     36     assert np.allclose(attn_w_b1[0, 0, 1], [0.5271505, 0.47284946, 0.], atol=1e-2), "Wrong values in attn_w_b1. Check the call to self.mha1"
     37     assert np.allclose(attn_w_b2[0, 0, 1], [0.33365652, 0.32598493, 0.34035856]), "Wrong values in attn_w_b2. Check the call to self.mha2"
---> 38     assert np.allclose(out[0, 0], [0.04726627, -1.6235218, 1.0327158, 0.54353976]), "Wrong values in out"
     39
     40

AssertionError: Wrong values in out
```
I have checked this again and again, but I still can't get the test to pass.
Any ideas?
TMosh
July 14, 2021, 11:45pm
2
Your DecoderLayer() function doesn’t return the correct results for the first test case.
The error message doesn’t reveal much more.
Hi,
I guess the problem comes from:
attn2, attn_weights_block2 = self.mha2(…)
{mentor edit: code details removed}
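(As I understand it, Block 2 should take its queries from Block 1's output and its keys/values from the encoder output. Assuming tutorial-style names like out1, enc_output, and padding_mask, which may differ from the notebook, the call should have roughly this shape:)

```python
# Rough sketch; out1 / enc_output / padding_mask are assumed names.
# Query comes from Block 1's output; value and key come from the encoder output.
attn2, attn_weights_block2 = self.mha2(
    out1, enc_output, enc_output,
    attention_mask=padding_mask, return_attention_scores=True)
```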
It still doesn't work.
I assume I'm doing something wrong in the DecoderLayer.
{mentor edit: code removed}
TMosh
July 16, 2021, 7:04am
5
I recommend you pay closer attention to the details:
{mistake has been corrected: the dropout layers need the training argument}
In Block 2, you repeated dropout1(); it should be dropout2().
In Block 2, you used layernorm1() where it should be layernorm2().
In Block 3, you used ffn() twice; the second one should be dropout3().
In Block 3, you used layernorm1() again; it should be layernorm3().
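For reference, here is a minimal sketch of how the three blocks fit together. Layer and argument names follow the public TensorFlow Transformer tutorial; your notebook's signatures may differ slightly, so treat this as a structural guide, not the assignment solution:

```python
import tensorflow as tf

class DecoderLayer(tf.keras.layers.Layer):
    def __init__(self, embedding_dim, num_heads, fully_connected_dim,
                 dropout_rate=0.1, layernorm_eps=1e-6):
        super().__init__()
        self.mha1 = tf.keras.layers.MultiHeadAttention(num_heads=num_heads, key_dim=embedding_dim)
        self.mha2 = tf.keras.layers.MultiHeadAttention(num_heads=num_heads, key_dim=embedding_dim)
        self.ffn = tf.keras.Sequential([
            tf.keras.layers.Dense(fully_connected_dim, activation="relu"),
            tf.keras.layers.Dense(embedding_dim),
        ])
        self.layernorm1 = tf.keras.layers.LayerNormalization(epsilon=layernorm_eps)
        self.layernorm2 = tf.keras.layers.LayerNormalization(epsilon=layernorm_eps)
        self.layernorm3 = tf.keras.layers.LayerNormalization(epsilon=layernorm_eps)
        self.dropout1 = tf.keras.layers.Dropout(dropout_rate)
        self.dropout2 = tf.keras.layers.Dropout(dropout_rate)
        self.dropout3 = tf.keras.layers.Dropout(dropout_rate)

    def call(self, x, enc_output, training, look_ahead_mask, padding_mask):
        # Block 1: masked self-attention over the decoder input
        attn1, attn_weights_block1 = self.mha1(
            x, x, x, attention_mask=look_ahead_mask, return_attention_scores=True)
        attn1 = self.dropout1(attn1, training=training)
        out1 = self.layernorm1(attn1 + x)

        # Block 2: cross-attention (queries from Block 1, keys/values from the encoder)
        attn2, attn_weights_block2 = self.mha2(
            out1, enc_output, enc_output,
            attention_mask=padding_mask, return_attention_scores=True)
        attn2 = self.dropout2(attn2, training=training)   # dropout2, not dropout1
        out2 = self.layernorm2(attn2 + out1)              # layernorm2, not layernorm1

        # Block 3: position-wise feed-forward network, called once
        ffn_output = self.ffn(out2)
        ffn_output = self.dropout3(ffn_output, training=training)
        out3 = self.layernorm3(ffn_output + out2)         # layernorm3, not layernorm1

        return out3, attn_weights_block1, attn_weights_block2
```

Each block follows the same residual pattern: sublayer, dropout, then layer normalization over the sublayer output plus its input, which is why mixing up the numbered layers silently produces wrong values rather than an error.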
Hi,
thank you. Now it works!
kechan
July 17, 2021, 4:37pm
7
Why should dropout1() not have a training argument? Since we are doing this in call(…), dropout needs to be deactivated during inference (where training is False). I actually do pass this argument to dropout1, and the unit test passed.
Adding: I also got 100% after submission, so it isn't only the unit test that passes. I hope the “test” in the backend is strong enough to discern whether adding training to dropout(…) inside a layer class's call(…) is the right thing to do. I do think it is correct from what I understood…
You’re correct. The training flag should be passed to the dropout layer.
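A quick standalone check makes the difference visible (minimal sketch, independent of the assignment):

```python
import tensorflow as tf

drop = tf.keras.layers.Dropout(0.5)
x = tf.ones((1, 4))

# training=True: roughly half the units are zeroed, survivors scaled by 1/(1 - rate)
print(drop(x, training=True))

# training=False: dropout is the identity, so the input passes through unchanged
print(drop(x, training=False))
```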
TMosh
July 19, 2021, 3:31am
9
Fixed it, thanks for catching that.
The unit tests and grader don't care whether training is specified for the dropout layers; that's how it snuck through.