I don’t understand why this error appears.
```
AssertionError: Wrong values in attn_w_b2. Check the call to self.mha2
```
Why is it wrong?
```python
# BLOCK 2
# calculate self-attention using the Q from the first block and K and V from the encoder output.
# Dropout will be applied during training
# Return attention scores as attn_weights_block2 (~1 line)
mult_attn_out2, attn_weights_block2 = self.mha2(Q1, enc_output, enc_output, attention_mask=padding_mask, return_attention_scores=True)  # (batch_size, target_seq_len, embedding_dim)
```
Ensure you are passing the training parameter so that dropout is handled correctly (training=training). If the issue persists, check that padding_mask is broadcastable to the shape self.mha2 expects, and that Q1 and enc_output have the correct shapes.
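For reference, here is a minimal sketch of that call with the training flag passed through, assuming self.mha2 is a tf.keras.layers.MultiHeadAttention layer as in the assignment, and reusing the variable names from your snippet:

```python
# Sketch only, inside the decoder layer's call(); Q1, enc_output,
# padding_mask, and training are the names used in the snippet above.
mult_attn_out2, attn_weights_block2 = self.mha2(
    Q1,                            # query: output of the first attention block
    enc_output,                    # value: the encoder output
    enc_output,                    # key: the encoder output
    attention_mask=padding_mask,   # mask padded positions of the encoder output
    return_attention_scores=True,  # also return attn_weights_block2
    training=training,             # dropout is only active when training=True
)
```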
Hope this helps; feel free to ask if you need further assistance!
Yes, in Block 2, dropout should be applied during training. The corresponding comment line suggests that you need to pass the training argument correctly so that dropout can be applied.
Yes,
I’ve finally found the mistake. The error actually comes from the parameters passed to self.mha1, not to self.mha2 as the error message seems to indicate.
Instead of (x, x, x, …), I had written (x, enc_output, enc_output, …) in mha1…
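For anyone who hits the same assertion, here is a minimal, self-contained sketch (the names and shapes are my own, not the graded notebook) showing why mha1 must take (x, x, x) while mha2 takes (Q1, enc_output, enc_output); the attention-score shapes make the difference visible:

```python
import tensorflow as tf

# Standalone sketch, not the graded solution: Block 1 is self-attention over
# the decoder input, Block 2 is cross-attention over the encoder output.
batch, tgt_len, src_len, dim = 2, 5, 7, 16
x = tf.random.uniform((batch, tgt_len, dim))           # decoder input
enc_output = tf.random.uniform((batch, src_len, dim))  # encoder output

mha1 = tf.keras.layers.MultiHeadAttention(num_heads=2, key_dim=8)
mha2 = tf.keras.layers.MultiHeadAttention(num_heads=2, key_dim=8)

# Block 1: self-attention, so query, value, and key are all x.
out1, attn_w_b1 = mha1(x, x, x, return_attention_scores=True)
print(attn_w_b1.shape)  # (batch, num_heads, tgt_len, tgt_len) -> (2, 2, 5, 5)

# Block 2: cross-attention, so only the query comes from Block 1;
# value and key are the encoder output.
out2, attn_w_b2 = mha2(out1, enc_output, enc_output, return_attention_scores=True)
print(attn_w_b2.shape)  # (batch, num_heads, tgt_len, src_len) -> (2, 2, 5, 7)
```

Note that passing enc_output into mha1 still runs without raising an exception (the shapes happen to be compatible), which is presumably why the test only catches the mistake downstream, as wrong values in attn_w_b2.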