C4W2_Assignment: DecoderLayer issue

My DecoderLayer output does not match the expected output; the shape of Att Weights Block 2 differs.

Using embedding_dim=12 and num_heads=16:
q has shape:(1, 15, 12)
Output of encoder has shape:(1, 7, 8)

Output of decoder layer has shape:(1, 15, 12)
Att Weights Block 1 has shape:(1, 16, 15, 15)
Att Weights Block 2 has shape:(1, 16, 15, 7)

Expected Output:
Using embedding_dim=12 and num_heads=16:

q has shape:(1, 15, 12)
Output of encoder has shape:(1, 7, 8)

Output of decoder layer has shape:(1, 15, 12)
Att Weights Block 1 has shape:(1, 16, 15, 15)
Att Weights Block 2 has shape:(1, 16, 15, 3)

Hi @Muhammad_Shariq_Shoa

It’s against the Code of Conduct to share your code, so please remove it.

Cheers


Hi, I’m also facing the same issue: the shape of attn_weights_block2 is (1, 16, 15, 7) while it is supposed to be (1, 16, 15, 3).
The shapes of the other two values, out3 and attn_weights_block1, are still correct.


Same problem here.

@lucas.coutinho, attn_weights_block2 should have shape (batch_size, num_heads, target_seq_len, input_seq_len), as per the Return comments in the call function, and input_seq_len is indeed 7.
Furthermore, attn_weights_block1 and attn_weights_block2 should have different shapes, contrary to what the commented block of the call function says.

Could you kindly advise whether there is an issue with our solution approach, or whether this reported bug is legitimate?
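
For an independent sanity check of that shape convention, without sharing any assignment code, the stock tf.keras.layers.MultiHeadAttention layer can be called directly. This is only a sketch; the tensor sizes come from the printout above and everything else is an assumption:

import tensorflow as tf

# Minimal sketch with the stock Keras layer, not the assignment's DecoderLayer.
mha1 = tf.keras.layers.MultiHeadAttention(num_heads=16, key_dim=12)  # block 1: self-attention
mha2 = tf.keras.layers.MultiHeadAttention(num_heads=16, key_dim=12)  # block 2: cross-attention

q = tf.random.uniform((1, 15, 12))         # (batch, target_seq_len, embedding_dim)
enc_output = tf.random.uniform((1, 7, 8))  # (batch, input_seq_len, features)

# Block 1: queries, keys and values all come from the target sequence.
_, w_block1 = mha1(query=q, value=q, key=q, return_attention_scores=True)
# Block 2: keys and values come from the encoder output.
_, w_block2 = mha2(query=q, value=enc_output, key=enc_output, return_attention_scores=True)

print(w_block1.shape)  # (1, 16, 15, 15) = (batch, num_heads, target_seq_len, target_seq_len)
print(w_block2.shape)  # (1, 16, 15, 7)  = (batch, num_heads, target_seq_len, input_seq_len)

So the last axis of the block 2 weights tracks the encoder sequence length, which is 7 in the printed example.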

@Muhammad_Shariq_Shoa We are looking into this on our end and will give an update as soon as possible.

Hi guys, thanks for your input on this. The issue has already been filed on GitHub and I have assigned it to the Curriculum Engineer; as soon as I get an update I will post here.

Hi all! The markdown has a mistake; this will be fixed today. (You might not see the change unless you refresh your assignment to get the latest version, but it will be fixed automatically for future learners.)

It is not only the markdown; we are also facing failures in its test cases due to the shape.

This also leads to the following test failures:
Failed test case: Wrong values in ‘attn_w_b2’. Check the call to self.mha2.
Expected: [0.34003818, 0.32569194, 0.33426988]
Got: [0.34083953 0.32673767 0.33242285]

Failed test case: Wrong values in ‘out’.
Expected: [1.1810006, -1.5600019, 0.41289005, -0.03388882]
Got: [ 1.3311304 -1.4207214 0.365438 -0.275847 ]

Failed test case: Wrong values in ‘out’ when we mask the last word. Are you passing the padding_mask to the inner functions?.
Expected: [1.1297308, -1.6106694, 0.32352272, 0.15741566]
Got: [ 1.3888907 -1.414115 0.2009444 -0.17572011]
@a-zarta @lucas.coutinho @jyadav202
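
For context, here is a rough sketch of what the ‘mask the last word’ test appears to probe, again using the stock Keras layer rather than the assignment’s DecoderLayer; the mask shape and values are assumptions for illustration only:

import tensorflow as tf

mha2 = tf.keras.layers.MultiHeadAttention(num_heads=16, key_dim=12)

q = tf.random.uniform((1, 15, 12))          # decoder queries
enc_output = tf.random.uniform((1, 7, 8))   # encoder output; pretend the last position is padding

# attention_mask has shape (batch, target_seq_len, input_seq_len): 1 = attend, 0 = ignore.
padding_mask = tf.constant([[1., 1., 1., 1., 1., 1., 0.]])            # mask the last word
attention_mask = tf.cast(tf.tile(padding_mask[:, tf.newaxis, :], [1, 15, 1]), tf.bool)

out_masked, w_masked = mha2(query=q, value=enc_output, key=enc_output,
                            attention_mask=attention_mask, return_attention_scores=True)
out_plain, w_plain = mha2(query=q, value=enc_output, key=enc_output,
                          return_attention_scores=True)

# With the mask in place, the weights on the last encoder position collapse to ~0,
# so the output values change. If the padding mask never reaches the second MHA,
# the output stays the same, which matches the kind of mismatch the test reports.
print(float(tf.reduce_max(w_masked[..., -1])))  # ~0.0
print(float(tf.reduce_max(w_plain[..., -1])))   # > 0.0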

Are you certain there isn’t a defect in your code? If the test case and the shape were incorrect, I think we would see a lot more reports about the issue than just yours.

I believe so, as I have gone through the TF documentation too.

@Muhammad_Shariq_Shoa can you share your notebook via DM and I’ll take a look?