I have a problem in C4_W2_Assignment, Exercise 2 - Decoder Layer.
The hint is:
# BLOCK 2
# calculate self-attention using the Q from the first block and K and V from the encoder output.
# Dropout will be applied during training
# Return attention scores as attn_weights_block2 (~1 line)
Based on the hint, my code looks like this:
mult_attn_out2, attn_weights_block2 = scaled_dot_product_attention(Q1, enc_output[1], enc_output[2], padding_mask)
I think the parameters are not correct, starting with Q1. I assumed enc_output would be a tuple of Q, K, V from the encoder.
Any help here?
Zhiyi
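A quick note on that first attempt: enc_output is a single tensor of shape (batch_size, input_seq_len, embedding_dim), not a tuple, so enc_output[1] and enc_output[2] just slice individual examples off the batch axis. A tiny standalone check (shapes are illustrative, not the assignment's exact values):

import tensorflow as tf

# enc_output is one tensor, so integer indexing slices the batch axis
# rather than unpacking Q, K, V.
enc_output = tf.random.uniform((4, 7, 16))  # (batch, input_seq_len, d_model)
print(enc_output[1].shape)  # (7, 16): the second example in the batch, not K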
You have to use the variables (layers) defined in the __init__ method of the class above!
I'm still lost here. We have enc_output, the output of the encoder, which the instructions say we should get K and V from. What I don't understand is how to get K and V out of enc_output.
I don't immediately see a variable (layer) defined in the __init__ method of DecoderLayer that we would use to get K and V out of enc_output.
Can anyone provide any more information?
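In case it helps: in the versions of this assignment I've seen, self.mha2 is a tf.keras.layers.MultiHeadAttention created in __init__, and that layer projects K and V out of enc_output internally with its own learned weights. You never unpack enc_output yourself; you pass the same tensor as both value and key. A minimal standalone sketch (num_heads, key_dim, and all shapes are made up for illustration):

import tensorflow as tf

# Stand-in for self.mha2 as defined in DecoderLayer.__init__
# (assumed to be a tf.keras.layers.MultiHeadAttention).
mha2 = tf.keras.layers.MultiHeadAttention(num_heads=4, key_dim=16)

Q1 = tf.random.uniform((2, 5, 16))          # block 1 output: (batch, target_len, d_model)
enc_output = tf.random.uniform((2, 7, 16))  # encoder output: (batch, input_len, d_model)
padding_mask = tf.ones((2, 1, 7))           # 1 = attend, 0 = ignore; broadcasts over heads

# Q comes from the decoder, while K and V are both projected from
# enc_output inside the layer -- hence enc_output appears twice.
mult_attn_out2, attn_weights_block2 = mha2(
    query=Q1,
    value=enc_output,
    key=enc_output,
    attention_mask=padding_mask,
    return_attention_scores=True,
)
print(mult_attn_out2.shape)       # (2, 5, 16)
print(attn_weights_block2.shape)  # (2, 4, 5, 7)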
Yes, for this function I have: mult_attn_out2, attn_weights_block2 = self.mha2(Q1, enc_output…
Errors still exist:
My code in the Exercise 2 Decoder section is:
mult_attn_out2, attn_weights_block2 = self.mha2(Q1, enc_output, look_ahead_mask, padding_mask, return_attention_scores=True)
Test errors were:
Cell In[43], line 67, in DecoderLayer.call(self, x, enc_output, training, look_ahead_mask, padding_mask)
61 Q1 = self.layernorm1(mult_attn_out1 + x)
63 # BLOCK 2
64 # calculate self-attention using the Q from the first block and K and V from the encoder output.
65 # Dropout will be applied during training
66 # Return attention scores as attn_weights_block2 (~1 line)
---> 67 mult_attn_out2, attn_weights_block2 = self.mha2(Q1, enc_output, look_ahead_mask, padding_mask, return_attention_scores=True)
The failing line 67 was:
mult_attn_out2, attn_weights_block2 = self.mha2(Q1, enc_output, look_ahead_mask, padding_mask, return_attention_scores=True)
I changed it to:
mult_attn_out2, attn_weights_block2 = self.mha2(Q1, enc_output, padding_mask, return_attention_scores=True)
But the unit test still fails with:
Cell In[50], line 67, in DecoderLayer.call(self, x, enc_output, training, look_ahead_mask, padding_mask)
61 Q1 = self.layernorm1(mult_attn_out1 + x)
63 # BLOCK 2
64 # calculate self-attention using the Q from the first block and K and V from the encoder output.
65 # Dropout will be applied during training
66 # Return attention scores as attn_weights_block2 (~1 line)
---> 67 mult_attn_out2, attn_weights_block2 = self.mha2(Q1, enc_output, padding_mask, return_attention_scores=True)
69 # apply layer normalization (layernorm2) to the sum of the attention output and the output of the first block (~1 line)
70 mult_attn_out2 = self.layernorm1(mult_attn_out2 + Q1)
InvalidArgumentError: Exception encountered when calling layer 'key' (type EinsumDense).
Can anyone help?
How did you solve it? Please tell me.
I am getting the same error. Can you please tell me how you solved it?
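For anyone who finds this thread later, here is my best guess at what went wrong (hedged, since I can only see the snippets above). The call signature of tf.keras.layers.MultiHeadAttention is (query, value, key=None, attention_mask=None, ...), so in self.mha2(Q1, enc_output, padding_mask, return_attention_scores=True) the padding_mask lands in the key slot, and the mask tensor gets fed through the layer's 'key' EinsumDense projection, which is exactly the layer named in the InvalidArgumentError. Passing the mask by keyword, or supplying enc_output for both value and key, avoids that. The traceback also shows self.layernorm1 being reused at line 70, where block 2 presumably has its own self.layernorm2. A sketch of the likely fix (variable names follow the snippets above; Q2 is my own name for the block-2 output):

# Q from block 1; K and V are both projected from the encoder output.
# The mask is passed by keyword so it cannot be mistaken for the key tensor.
mult_attn_out2, attn_weights_block2 = self.mha2(
    Q1, enc_output, enc_output,
    attention_mask=padding_mask,
    return_attention_scores=True,
)

# Layer normalization for block 2 should use layernorm2, not layernorm1.
Q2 = self.layernorm2(mult_attn_out2 + Q1)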