I have a problem in C4_W2_Assignment, Exercise 2 - Decoder Layer.
The hint is:
# BLOCK 2
# calculate self-attention using the Q from the first block and K and V from the encoder output.
# Dropout will be applied during training
# Return attention scores as attn_weights_block2 (~1 line)
Based on the hint, my code looks like this:
mult_attn_out2, attn_weights_block2 = scaled_dot_product_attention(Q1, enc_output[1], enc_output[2], padding_mask)
I think the parameters are not correct, starting with Q1. I assumed enc_output would be a tuple of Q, K, V from the encoder.
Any help here?
Zhiyi
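A quick note on that first attempt: enc_output is a single tensor of shape (batch_size, input_seq_len, embedding_dim), not a tuple, so enc_output[1] and enc_output[2] just slice individual examples off the batch axis. A tiny standalone check (shapes are illustrative, not the assignment's exact values):

import tensorflow as tf

# enc_output is one tensor, so integer indexing slices the batch axis
# rather than unpacking Q, K, V.
enc_output = tf.random.uniform((4, 7, 16))  # (batch, input_seq_len, d_model)
print(enc_output[1].shape)  # (7, 16): the second example in the batch, not K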
You have to use the variables (layers) defined in the __init__ method of the class above!
I'm still lost here. We have enc_output, the output of the encoder, which the instructions say we should get K and V from. What I don't understand is how to get K and V out of enc_output.
I don't immediately see a variable (layer) defined in the __init__ method of DecoderLayer that we would use to get K and V out of enc_output.
Can anyone provide any more information?
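In case it helps: in the versions of this assignment I've seen, self.mha2 is a tf.keras.layers.MultiHeadAttention created in __init__, and that layer projects K and V out of enc_output internally with its own learned weights. You never unpack enc_output yourself; you pass the same tensor as both value and key. A minimal standalone sketch (num_heads, key_dim, and all shapes are made up for illustration):

import tensorflow as tf

# Stand-in for self.mha2 as defined in DecoderLayer.__init__
# (assumed to be a tf.keras.layers.MultiHeadAttention).
mha2 = tf.keras.layers.MultiHeadAttention(num_heads=4, key_dim=16)

Q1 = tf.random.uniform((2, 5, 16))          # block 1 output: (batch, target_len, d_model)
enc_output = tf.random.uniform((2, 7, 16))  # encoder output: (batch, input_len, d_model)
padding_mask = tf.ones((2, 1, 7))           # 1 = attend, 0 = ignore; broadcasts over heads

# Q comes from the decoder, while K and V are both projected from
# enc_output inside the layer -- hence enc_output appears twice.
mult_attn_out2, attn_weights_block2 = mha2(
    query=Q1,
    value=enc_output,
    key=enc_output,
    attention_mask=padding_mask,
    return_attention_scores=True,
)
print(mult_attn_out2.shape)       # (2, 5, 16)
print(attn_weights_block2.shape)  # (2, 4, 5, 7)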
Yes, for this function I have: mult_attn_out2, attn_weights_block2 = self.mha2(Q1, enc_output…
Errors still exist:
My code in the Exercise 2 Decoder section is:
mult_attn_out2, attn_weights_block2 = self.mha2(Q1, enc_output, look_ahead_mask, padding_mask, return_attention_scores=True)
Test errors were:
Cell In[43], line 67, in DecoderLayer.call(self, x, enc_output, training, look_ahead_mask, padding_mask)
61 Q1 = self.layernorm1(mult_attn_out1 + x)
63 # BLOCK 2
64 # calculate self-attention using the Q from the first block and K and V from the encoder output.
65 # Dropout will be applied during training
66 # Return attention scores as attn_weights_block2 (~1 line)
---> 67 mult_attn_out2, attn_weights_block2 = self.mha2(Q1, enc_output, look_ahead_mask, padding_mask, return_attention_scores=True)
The failing line 67 was:
mult_attn_out2, attn_weights_block2 = self.mha2(Q1, enc_output, look_ahead_mask, padding_mask, return_attention_scores=True)
I changed it to:
mult_attn_out2, attn_weights_block2 = self.mha2(Q1, enc_output, padding_mask, return_attention_scores=True)
But the unit test still fails with:
Cell In[50], line 67, in DecoderLayer.call(self, x, enc_output, training, look_ahead_mask, padding_mask)
61 Q1 = self.layernorm1(mult_attn_out1 + x)
63 # BLOCK 2
64 # calculate self-attention using the Q from the first block and K and V from the encoder output.
65 # Dropout will be applied during training
66 # Return attention scores as attn_weights_block2 (~1 line)
---> 67 mult_attn_out2, attn_weights_block2 = self.mha2(Q1, enc_output, padding_mask, return_attention_scores=True)
69 # apply layer normalization (layernorm2) to the sum of the attention output and the output of the first block (~1 line)
70 mult_attn_out2 = self.layernorm1(mult_attn_out2 + Q1)
InvalidArgumentError: Exception encountered when calling layer 'key' (type EinsumDense).
Can anyone help?
How did you solve it? Please tell me.
I am getting the same error. Can you please tell me how you solved it?
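For anyone who finds this thread later, here is my best guess at what went wrong (hedged, since I can only see the snippets above). The call signature of tf.keras.layers.MultiHeadAttention is (query, value, key=None, attention_mask=None, ...), so in self.mha2(Q1, enc_output, padding_mask, return_attention_scores=True) the padding_mask lands in the key slot, and the mask tensor gets fed through the layer's 'key' EinsumDense projection, which is exactly the layer named in the InvalidArgumentError. Passing the mask by keyword, or supplying enc_output for both value and key, avoids that. The traceback also shows self.layernorm1 being reused at line 70, where block 2 presumably has its own self.layernorm2. A sketch of the likely fix (variable names follow the snippets above; Q2 is my own name for the block-2 output):

# Q from block 1; K and V are both projected from the encoder output.
# The mask is passed by keyword so it cannot be mistaken for the key tensor.
mult_attn_out2, attn_weights_block2 = self.mha2(
    Q1, enc_output, enc_output,
    attention_mask=padding_mask,
    return_attention_scores=True,
)

# Layer normalization for block 2 should use layernorm2, not layernorm1.
Q2 = self.layernorm2(mult_attn_out2 + Q1)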