C5_W4_A1_exercise6

I passed Q1, enc_output, and padding_mask, and set return_attention_scores=True in the call to mha2. I might be wrong here; I got the error below. Any advice?

Well, the shapes of the variables being added at multi_attn_out2 are not compatible, so you have to see where they are originating from.

In the comments it says:

apply layer normalization (layernorm1) to the sum of the attention output and the input (~1 line)

    # BLOCK 2
    # calculate self-attention using the Q from the first block and K and V from the encoder output. 
    # Dropout will be applied during training
    # Return attention scores as attn_weights_block2 (~1 line) 

apply layer normalization (layernorm2) to the sum of the attention output and the output of the first block (~1 line)

So in those sums you have to use the block 1 and block 2 attention outputs, not the attention weights!
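
For reference, here is a minimal sketch of what block 2 can look like, using the variable names that appear in this thread (Q1, enc_output, padding_mask, multi_attn_out2, attn_weights_block2) and Keras's MultiHeadAttention API. Treat it as an illustration of the pattern, not the graded solution:

    # mha2 returns a pair when return_attention_scores=True: the attention
    # OUTPUT (batch, seq_len, d_model) and the attention WEIGHTS
    # (batch, num_heads, seq_len_q, seq_len_k) -- unpack both.
    multi_attn_out2, attn_weights_block2 = self.mha2(
        query=Q1,            # Q comes from the output of the first block
        value=enc_output,    # K and V both come from the encoder output
        key=enc_output,
        attention_mask=padding_mask,
        return_attention_scores=True,
    )

    # The residual connection adds the attention OUTPUT to the block 1
    # output. Adding attn_weights_block2 here is what triggers the shape
    # mismatch, since the weights carry an extra num_heads dimension.
    multi_attn_out2 = self.layernorm2(multi_attn_out2 + Q1)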

Thanks. I was able to solve it and can now proceed.
