Hi,
I am working on the C4 Week 2 NLP assignment and am having trouble getting the right output from block 2 of the decoder layer. I take the output of block 1 and pass it as input to block 2, so block 2 receives Q1 from the first block as the query and K and V from the encoder. I think this is correct, but the output is wrong, as shown below.
In the "test your function" block, all results are right except the last one.
My output:
Using embedding_dim=12 and num_heads=16:
q has shape:(1, 15, 12)
Output of encoder has shape:(1, 7, 8)
Output of decoder layer has shape:(1, 15, 12)
Att Weights Block 1 has shape:(1, 16, 15, 15)
Att Weights Block 2 has shape:(1, 16, 15, 15) ← incorrect output line
The correct output line should be:
Att Weights Block 2 has shape:(1, 16, 15, 7)
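To check my understanding of the shapes, here is a minimal sketch in plain Python. It is not the assignment code: it uses a single head, no learned projections, and made-up toy inputs, and the names `attention_weights`, `decoder_q` and `encoder_out` are my own. As far as I understand, the last axis of the attention weights follows the number of keys, so a trailing 15 instead of 7 would suggest that the keys in my block 2 come from the decoder sequence rather than from the encoder output.

```python
import math


def attention_weights(q, k):
    """Scaled dot-product attention weights for a single head.

    q: list of len_q query vectors; k: list of len_k key vectors.
    All vectors must have the same depth (an assumption of this sketch;
    a real multi-head layer projects them to a common depth first).
    Returns a len_q x len_k matrix whose rows each sum to 1.
    """
    depth = len(q[0])
    weights = []
    for q_vec in q:
        # One score per key: dot product scaled by sqrt(depth).
        scores = [
            sum(a * b for a, b in zip(q_vec, k_vec)) / math.sqrt(depth)
            for k_vec in k
        ]
        # Numerically stable softmax over the keys.
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


def shape(matrix):
    """Return (rows, columns) of a nested-list matrix."""
    return (len(matrix), len(matrix[0]))


# Hypothetical toy inputs: 15 decoder positions, 7 encoder positions, depth 4.
decoder_q = [[0.1 * (i + j) for j in range(4)] for i in range(15)]
encoder_out = [[0.05 * (i - j) for j in range(4)] for i in range(7)]

# Block 1 style (self-attention): queries and keys both come from the decoder.
self_attn = attention_weights(decoder_q, decoder_q)
# Block 2 style (cross-attention): queries from the decoder, keys from the encoder.
cross_attn = attention_weights(decoder_q, encoder_out)

print(shape(self_attn))   # (15, 15)
print(shape(cross_attn))  # (15, 7)
```

With the batch and head axes added back, these two shapes correspond to the (1, 16, 15, 15) and (1, 16, 15, 7) lines above.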
Here is the relevant part of the code I am running. Can someone please point out what my error is:
{CODES REMOVED BY MODERATOR}
Running test_decoder_layer() produces the following output:
enc_output.shape=(1, 3, 4)
Q1.shape=(1, 3, 4)
mult_attn_out1.shape=(1, 3, 4)
enc_output.shape=(1, 3, 4)
Q1.shape=(1, 3, 4)
mult_attn_out1.shape=(1, 3, 4)
Failed test case: Wrong values in ‘attn_w_b2’. Check the call to self.mha2.
Expected: [0.34003818, 0.32569194, 0.33426988]
Got: [0.331976 0.331976 0.336048]
Failed test case: Wrong values in ‘out’.
Expected: [1.1810006, -1.5600019, 0.41289005, -0.03388882]
Got: [ 1.571806 -1.1318253 0.05532974 -0.49531013]
Failed test case: Wrong values in ‘out’ when we mask the last word. Are you passing the padding_mask to the inner functions?.
Expected: [1.1297308, -1.6106694, 0.32352272, 0.15741566]
Got: [ 1.5595117 -1.146292 0.08369908 -0.496919 ]
I don't think I am passing the input to block 2 properly, but I am unable to understand why.
Thank you for your help
Sidd