[Week 4] Assignment 1 >> Decoder class error

:astonished: It worked! But this is kind of a hack. You mentioned that the fix comes from a “TensorFlow tutorial”, where I spotted potential bugs like the one below:
attn1, attn_weights_block1 = self.mha1(x, x, x, look_ahead_mask)  # will probably throw an error due to the missing return_attention_scores flag
Given that exercises 7/8 and 8/8 passed with the “incorrect” implementation, I think I know the source of the error :grin:
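
For reference, here is a minimal sketch of what the return_attention_scores flag changes. It assumes TensorFlow 2.x and tf.keras.layers.MultiHeadAttention; the layer sizes and the input x are arbitrary values chosen for illustration, not taken from the assignment.

```python
import tensorflow as tf

# Arbitrary illustrative sizes (assumptions, not the assignment's values).
mha = tf.keras.layers.MultiHeadAttention(num_heads=2, key_dim=4)
x = tf.random.uniform((1, 3, 8))  # (batch, seq_len, embedding_dim)

# Default call: a single output tensor comes back.
out_only = mha(query=x, value=x, key=x)

# With the flag set, the layer returns (output, attention_scores).
out, scores = mha(query=x, value=x, key=x, return_attention_scores=True)

print(out_only.shape)  # same shape as the query: (1, 3, 8)
print(scores.shape)    # (batch, num_heads, query_len, key_len)
```

Only the second call returns a pair, so a two-name unpacking like the line quoted above relies on the flag being set.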

Well, the way you hacked it is exactly where the bug comes from. The order of the call arguments of MultiHeadAttention differs between the Transformer tutorial and the TensorFlow core API (see Call arguments).
The order in the tutorial is v, k, q, but it is q, v, k in the core API; here is the core API implementation.

Whether or not you decide to use the hack, be sure to understand what these arguments (q, k, v) are for. That’s what we are supposed to learn.
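
To make the roles of q, k and v concrete, here is a small pure-Python sketch of scaled dot-product attention. The function attention and the toy inputs decoder_states and encoder_output are hypothetical names made up for this illustration; only the (query, value, key) parameter order is meant to mirror the core API order mentioned above.

```python
import math

def attention(query, value, key):
    """Scaled dot-product attention on lists of row vectors (illustration only)."""
    if len(key) != len(value):
        raise ValueError("key and value must have the same number of rows")
    scale = math.sqrt(len(query[0]))
    output = []
    for q in query:
        # Similarity of this query row to every key row.
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in key]
        # Numerically stable softmax over the key positions.
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted average of the value rows.
        output.append([sum(w * v[j] for w, v in zip(weights, value))
                       for j in range(len(value[0]))])
    return output

decoder_states = [[1.0, 0.0], [0.0, 1.0]]              # 2 target positions
encoder_output = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # 3 source positions

# Keyword arguments make the roles explicit, whatever the signature order is.
ok = attention(query=decoder_states, value=encoder_output, key=encoder_output)
print(len(ok))  # one output row per query (decoder) position

# Passing (v, k, q) positionally to a (q, v, k) signature binds the encoder
# output to query and the decoder states to key, which this sketch rejects.
try:
    attention(encoder_output, encoder_output, decoder_states)
except ValueError as err:
    print("mismatch:", err)
```

The query decides how many output rows there are, the keys decide how much weight each source position gets, and the values are what gets averaged. Passing the arguments by keyword sidesteps the ordering question entirely.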


Totally agree with you!

We need to pin this thread to the top so that folks won’t waste time on this issue.

I got the same error. Has anyone managed to sort it out?

Same error here as well. Tried a bunch of different combinations of q, k, v / v, k, q etc. but couldn’t break the 75/100 barrier.

Changing the order of q, k, v to v, k, q didn’t pass the tests. Could you please elaborate on this?

Looking forward to the fix :slight_smile:

My subscription ends on May 7 and I have the same issue! Has anyone had a similar experience where there was such a bug? I checked my code against the TensorFlow website and it is the same there, so I don’t think it’s a problem with my code; it looks more like a bug in the tests.

Please check out this workaround if you are in a hurry.

Fady_Michel, Thank you so much. I already did it and it worked🤩. Many thanks for your help

Actually this was all @RolandSherwin !
But you are welcome :relaxed:

Did the same and it also worked for me…

How do I know when it is fixed? Will the status be posted in this thread?

Yes, I’ll post a message here once the course staff have fixed it.

I am facing the same issue too.

DecoderLayer class

It worked for me… Thank you!!

Hello there, I have completed the assignment as suggested to get 100/100 but it fails. Here is my code, I need to get it done by Monday. Can you help me with that?

MultiheadAttention — PyTorch 1.8.1 documentation.

Check the order of KW^k, VW^v and QW^q.
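
For context, these three products are the projected inputs in the standard multi-head attention definition, written out below as a reference sketch. Note that the mathematical convention lists the arguments as Q, K, V, and PyTorch's MultiheadAttention forward call takes (query, key, value), so the positional order differs again from the two TensorFlow variants discussed earlier in this thread.

```latex
\mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_1, \dots, \mathrm{head}_h)\, W^O,
\qquad
\mathrm{head}_i = \mathrm{Attention}\!\left(Q W_i^Q,\; K W_i^K,\; V W_i^V\right)

\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^\top}{\sqrt{d_k}}\right) V
```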

Thanks. Super hacky, but it solves the problem. Let’s hope this is fixed soon so that others don’t have to go through this.

