C5_W4_A1 Wrong values in attn_w_b2. Check the call to self.mha2

I am building out the decoder for UNQ_C6 and receive this error message:
AssertionError: Wrong values in attn_w_b2. Check the call to self.mha2

Right now I’m using the normalized sum of the attention output and the input as Q. For K and V I’m using enc_output, and I’m passing padding_mask as specified in the function definitions.
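In code terms, my block 2 call boils down to roughly this (a minimal standalone sketch with placeholder shapes and stand-in names, not the notebook’s exact code):

```python
import tensorflow as tf

# Stand-ins for the attributes and tensors inside DecoderLayer.call.
mha2 = tf.keras.layers.MultiHeadAttention(num_heads=8, key_dim=64)
Q1 = tf.random.uniform((1, 10, 512))          # normalized output of block 1
enc_output = tf.random.uniform((1, 12, 512))  # encoder output
padding_mask = tf.ones((1, 10, 12))           # 1 = attend, 0 = masked out

mult_attn_out2, attn_weights_block2 = mha2(
    Q1, enc_output, enc_output,               # query, value, key
    attention_mask=padding_mask,
    return_attention_scores=True,
)
```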

I’ve read other posts about this but decided to create a new one since they are all several years old now. I tried using x instead of enc_output, but that goes against my understanding of what we’re doing with the transformer. I also wondered whether passing attn_weights_block1 is the right path, but I get the same error when I use them.

Are you using the return_attention_scores parameter in self.mha2, at all?

Yes, I have set it to True. Whether it’s True or False, I get this error.
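For reference, my understanding of that parameter (a minimal standalone sketch, assuming tf.keras.layers.MultiHeadAttention):

```python
import tensorflow as tf

mha = tf.keras.layers.MultiHeadAttention(num_heads=2, key_dim=32)
q = tf.random.uniform((1, 4, 64))

out = mha(q, q, q)                                   # default: output only
out, w = mha(q, q, q, return_attention_scores=True)  # output plus weights
```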

Perhaps the error is not with the mha2() call. That’s just a blanket suggestion from the notebook test case - it often is not the actual cause.

Interesting. Assuming the error is not in the self.mha2() call, I tried breaking other parts of the code in this cell before the call to multi-head attention block 2. I’m also assuming the previous cells are correct, given that they have all passed their tests. This would mean the error may be in defining Q1, mult_attn_out1, or attn_weights_block1.

Block 1 mha: I’m passing the input x three times to perform self-attention, plus a look-ahead mask, and returning the attention weights. If I use enc_output instead of x I get an error at self.mha1. I wondered whether I also need to pass the training flag to enable dropout, but my attempts at that have also failed.
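Here’s a standalone sketch of what my block 1 call boils down to (placeholder shapes and stand-in names, not the notebook’s exact code; the training flag would only matter if the layer was built with a nonzero dropout rate):

```python
import tensorflow as tf

mha1 = tf.keras.layers.MultiHeadAttention(num_heads=8, key_dim=64)
x = tf.random.uniform((1, 10, 512))  # decoder input

# Lower-triangular look-ahead mask: position t may attend to positions <= t.
look_ahead_mask = tf.linalg.band_part(tf.ones((1, 10, 10)), -1, 0)

mult_attn_out1, attn_weights_block1 = mha1(
    x, x, x,                          # self-attention: query = value = key = x
    attention_mask=look_ahead_mask,
    return_attention_scores=True,
)
```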

Q1: This is pretty straightforward; I’m adding mult_attn_out1 and enc_output.


Update: I now realize I had to sum mult_attn_out1 and x instead of enc_output. This solved my issue.
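For anyone who lands here later: the skip connection after block 1 adds the block’s own input x; enc_output only enters in block 2 as K and V. A minimal sketch of the fix (standalone, with placeholder shapes; layernorm1 stands in for the notebook’s layer normalization attribute):

```python
import tensorflow as tf

layernorm1 = tf.keras.layers.LayerNormalization(epsilon=1e-6)
x = tf.random.uniform((1, 10, 512))               # decoder input
mult_attn_out1 = tf.random.uniform((1, 10, 512))  # block 1 attention output

Q1 = layernorm1(mult_attn_out1 + x)               # residual uses x
# Q1 = layernorm1(mult_attn_out1 + enc_output)    # wrong: caused the attn_w_b2 error
```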
