Week 2 Assignment - can't figure out the padding_mask step. Any hints?

I’m really struggling to clear this error. I understand why the padding_mask is needed, but I don’t know where in the routine to apply it.

GRADED FUNCTION: DecoderLayer…

Failed test case: Wrong values in ‘out’ when we mask the last word. Are you passing the padding_mask to the inner functions?
Expected: [1.1297308, -1.6106694, 0.32352272, 0.15741566]
Got: [ 1.2026718 -1.5374368 0.4260614 -0.09129634]


Hi @julianhatwell

The padding mask for the Decoder should be applied in the DecoderLayer itself (Exercise 2), after this code comment:

        # BLOCK 2
        # calculate self-attention using the Q from the first block and K and V from the encoder output. 
        # Dropout will be applied during training
        # Return attention scores as attn_weights_block2 (~1 line) 
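For reference, the call there could look something like this. It's only a minimal sketch: `Q1` (the output of Block 1) and `mult_attn_out2` are placeholder names, and I'm assuming `self.mha2` is a `tf.keras.layers.MultiHeadAttention` as in the notebook. The point is simply that padding_mask goes in as the attention_mask:

    mult_attn_out2, attn_weights_block2 = self.mha2(
        query=Q1,                      # Q from the first block
        value=enc_output,              # V from the encoder output
        key=enc_output,                # K from the encoder output
        attention_mask=padding_mask,   # hide the Encoder's padded positions
        return_attention_scores=True,  # needed to return attn_weights_block2
    )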

and also in the Decoder class (Exercise 3), after this code comment:

            # pass x and the encoder output through a stack of decoder layers and save the attention weights
            # of block 1 and 2 (~1 line)
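Here the one line is just forwarding all the masks into each DecoderLayer as you loop over the stack. Again a sketch with assumed names (`self.dec_layers` holding the layer stack; the dictionary key names are illustrative):

    for i in range(self.num_layers):
        # each DecoderLayer receives both masks; padding_mask reaches Block 2
        x, block1, block2 = self.dec_layers[i](
            x, enc_output, training, look_ahead_mask, padding_mask
        )
        attention_weights[f'decoder_layer{i+1}_block1_self_att'] = block1
        attention_weights[f'decoder_layer{i+1}_block2_decenc_att'] = block2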

In other words, to pass this unit test, you need the padding mask in both of these places. The padding mask tells the Decoder layers’ Cross Attention which of the Encoder’s input tokens were padding, so that the Decoder can better decide which Encoder inputs are relevant.
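If it helps to see the convention: the mask itself is just ones over real tokens and zeros over padding, broadcast over the query positions. The notebook provides its own helper, so treat this only as an illustration (assuming pad id 0, and the Keras convention that 1 means "attend"):

    import tensorflow as tf

    def create_padding_mask(seq):
        # 1.0 where the token is real, 0.0 where it is the pad id (assumed to be 0)
        mask = tf.cast(tf.math.not_equal(seq, 0), tf.float32)
        # add an axis so the mask broadcasts over query positions: (batch, 1, key_len)
        return mask[:, tf.newaxis, :]

    # e.g. tf.constant([[7, 6, 0, 0]]) -> [[[1., 1., 0., 0.]]]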

Cheers

OK, the unit tests pass now. Thank you. I was over-thinking the process; given the information in your answer, it turned out to be the simplest possible step.
