I passed Q1, enc_output, and padding_mask to mha2 and set return_attention_scores=True. I might be wrong here; I got the error below. Any advice?
Well, the shapes of the variables being added at multi_attn_out2 are not compatible, so you have to see where they originate from.
In the comments it says:
# apply layer normalization (layernorm1) to the sum of the attention output and the input (~1 line)
# BLOCK 2
# calculate self-attention using the Q from the first block and K and V from the encoder output.
# Dropout will be applied during training
# Return attention scores as attn_weights_block2 (~1 line)
# apply layer normalization (layernorm2) to the sum of the attention output and the output of the first block (~1 line)
So you have to use block1 and block2 attention outputs, not the weights!
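To make that concrete, here is a minimal sketch of what BLOCK 2 can look like with Keras' MultiHeadAttention. The layer names (mha2, layernorm2) follow the assignment comments; the wrapper function and the variable names Q1, enc_output, padding_mask, and training are assumptions for illustration, not the graded code.

```python
import tensorflow as tf

def decoder_block2(Q1, enc_output, padding_mask, training, mha2, layernorm2):
    # Cross-attention: queries come from block 1's output (Q1),
    # keys and values come from the encoder output.
    mult_attn_out2, attn_weights_block2 = mha2(
        query=Q1,
        value=enc_output,
        key=enc_output,
        attention_mask=padding_mask,
        return_attention_scores=True,
        training=training,  # dropout inside mha2 is applied only during training
    )
    # Add the attention OUTPUT (not the attention weights) to Q1, then normalize.
    out2 = layernorm2(mult_attn_out2 + Q1)
    return out2, attn_weights_block2
```

The key point is in the last two lines: the residual connection and layernorm2 use mult_attn_out2, while attn_weights_block2 is only returned for inspection. Adding the weights instead is what produces the shape mismatch.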
Thanks. I was able to solve it and can proceed now.