This is my code:
def call(self, x, enc_output, training, look_ahead_mask, padding_mask):
    """
    Forward pass for the Decoder Layer

    Arguments:
        x -- Tensor of shape (batch_size, target_seq_len, fully_connected_dim)
        enc_output -- Tensor of shape (batch_size, input_seq_len, fully_connected_dim)
        training -- Boolean, set to true to activate the training mode for dropout layers
        look_ahead_mask -- Boolean mask for the target_input
        padding_mask -- Boolean mask for the second multihead attention layer

    Returns:
        out3 -- Tensor of shape (batch_size, target_seq_len, fully_connected_dim)
        attn_weights_block1 -- Tensor of shape (batch_size, num_heads, target_seq_len, input_seq_len)
        attn_weights_block2 -- Tensor of shape (batch_size, num_heads, target_seq_len, input_seq_len)
    """
    # START CODE HERE
    # enc_output.shape == (batch_size, input_seq_len, fully_connected_dim)

    # BLOCK 1
    # calculate self-attention and return attention scores as attn_weights_block1.
    # Dropout will be applied during training (~1 line).
    mult_attn_out1, attn_weights_block1 = self.mha1(x, x, x, look_ahead_mask, return_attention_scores=True)  # (batch_size, target_seq_len, d_model)

    # apply layer normalization (layernorm1) to the sum of the attention output and the input (~1 line)
    Q1 = self.layernorm1(mult_attn_out1 + x)

    # BLOCK 2
    # calculate attention using the Q from the first block and K and V from the encoder output.
    # Dropout will be applied during training.
    # Return attention scores as attn_weights_block2 (~1 line)
    mult_attn_out2, attn_weights_block2 = self.mha2(Q1, enc_output, enc_output, padding_mask, return_attention_scores=True)  # (batch_size, target_seq_len, d_model)

    # apply layer normalization (layernorm2) to the sum of the attention output and the output of the first block (~1 line)
    mult_attn_out2 = self.layernorm2(mult_attn_out2 + Q1)  # (batch_size, target_seq_len, fully_connected_dim)

    # BLOCK 3
    # pass the output of the second block through a ffn
    ffn_output = self.ffn(mult_attn_out2)  # (batch_size, target_seq_len, fully_connected_dim)

    # apply a dropout layer to the ffn output
    ffn_output = self.dropout_ffn(mult_attn_out2, training=training)

    # apply layer normalization (layernorm3) to the sum of the ffn output and the output of the second block
    out3 = self.layernorm3(ffn_output + mult_attn_out2)  # (batch_size, target_seq_len, fully_connected_dim)
    # END CODE HERE

    return out3, attn_weights_block1, attn_weights_block2
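For reference, this is how I understand the multi-head attention call used in both blocks. It is a minimal standalone sketch, assuming self.mha1 and self.mha2 are tf.keras.layers.MultiHeadAttention layers (as in the assignment scaffold); the layer sizes and random input here are illustrative only:

import tensorflow as tf

# Standalone check of the (output, attention_scores) tuple returned when
# return_attention_scores=True (num_heads=2, key_dim=4 are illustrative values).
mha = tf.keras.layers.MultiHeadAttention(num_heads=2, key_dim=4)
x = tf.random.uniform((1, 3, 8))  # (batch_size, target_seq_len, fully_connected_dim)
attn_out, attn_scores = mha(x, x, x, return_attention_scores=True)
print(attn_out.shape)     # (1, 3, 8)   -> same shape as the query input
print(attn_scores.shape)  # (1, 2, 3, 3) -> (batch_size, num_heads, target_seq_len, source_seq_len)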
I’m getting the following error:
AssertionError                            Traceback (most recent call last)
<ipython-input-...> in <module>
      1 # UNIT TEST
----> 2 DecoderLayer_test(DecoderLayer, create_look_ahead_mask)

~/work/W4A1/public_tests.py in DecoderLayer_test(target, create_look_ahead_mask)
    180     assert np.allclose(attn_w_b1[0, 0, 1], [0.5271505, 0.47284946, 0.], atol=1e-2), "Wrong values in attn_w_b1. Check the call to self.mha1"
    181     assert np.allclose(attn_w_b2[0, 0, 1], [0.32048798, 0.390301, 0.28921106]), "Wrong values in attn_w_b2. Check the call to self.mha2"
--> 182     assert np.allclose(out[0, 0], [-0.22109576, -1.5455486, 0.852692, 0.9139523]), "Wrong values in out"
    183
    184

AssertionError: Wrong values in out
What am I doing wrong?