C4_W2 Programming Assignment: Transformer Summarizer — grader error: a problem compiling the code from your notebook. Details: Exception encountered when calling layer 'softmax_3' (type Softmax)

Hi @TungTTTHE172215

You probably made the same mistake: passing the wrong input when creating the decoder's padding mask.

Note that when creating the padding mask for the decoder's second attention block (the cross-attention over the encoder output), we use the encoder_input. In other words, we tell the decoder not to pay attention to the padding tokens of the document being summarized.
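A minimal sketch of what I mean, assuming a mask convention where 1 marks real tokens and 0 marks padding, and a pad token id of 0 (your assignment's own `create_padding_mask` may differ in shape or convention):

```python
import tensorflow as tf

def create_padding_mask(token_ids, pad_token_id=0):
    # 1.0 where the token is real, 0.0 where it is padding (assumed convention)
    mask = 1.0 - tf.cast(tf.math.equal(token_ids, pad_token_id), tf.float32)
    # add an axis so the mask broadcasts over attention heads / query positions
    return mask[:, tf.newaxis, :]

# Toy batch: the decoder's SECOND attention block masks the padding of the
# ENCODER input (the document to summarize), not the decoder input.
encoder_input = tf.constant([[7, 5, 3, 0, 0]])   # 0 = <pad>
dec_padding_mask = create_padding_mask(encoder_input)
print(dec_padding_mask)  # [[[1. 1. 1. 0. 0.]]]
```

If you build this mask from the decoder input instead, the shapes no longer match the encoder output inside the cross-attention softmax, which is exactly the kind of error the grader reports for the Softmax layer.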

Also note that this is different from the look_ahead_mask (causal mask), which allows each decoder position to attend only to itself and the preceding tokens.
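For contrast, here is a sketch of a causal mask using the same 1 = "may attend" convention (the assignment may name or shape it differently):

```python
def create_look_ahead_mask(sequence_length):
    # lower-triangular matrix: position i may attend to positions 0..i only
    return tf.linalg.band_part(tf.ones((sequence_length, sequence_length)), -1, 0)

print(create_look_ahead_mask(4))
# [[1. 0. 0. 0.]
#  [1. 1. 0. 0.]
#  [1. 1. 1. 0.]
#  [1. 1. 1. 1.]]
```

This one is built from the decoder's own sequence length and is used in the decoder's first (self-attention) block.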

Cheers