C5 W4 UNQ_C7 Wrong values in outd

    x = self.embedding(x)  # (batch_size, target_seq_len, fully_connected_dim)

    # scale embeddings by multiplying by the square root of their dimension
    x *= tf.math.sqrt(tf.cast(self.embedding_dim, tf.float32))
    
    # calculate positional encodings and add to word embedding
    x += self.pos_encoding[:, :seq_len, :]

    # apply a dropout layer to x
    x = self.dropout(x, training=training)

    # use a for loop to pass x through a stack of decoder layers and update attention_weights (~4 lines total)
    for i in range(self.num_layers):
        # pass x and the encoder output through a stack of decoder layers and save the attention weights
        # of block 1 and 2 (~1 line)
        x, block1, block2 = self.dec_layers[i](x, enc_output, training,
                                               look_ahead_mask, padding_mask)

        # update attention_weights dictionary with the attention weights of block 1 and block 2
        attention_weights['decoder_layer{}_block1_self_att'.format(i+1)] = block1
        attention_weights['decoder_layer{}_block2_decenc_att'.format(i+1)] = block2

Can someone tell me what's wrong with my code? This is the error I am getting:

Wrong values in outd when training=True and use padding mask

I don’t see any obvious problem with the code you posted.

Please post the full stack trace of the assertion error.

AssertionError                            Traceback (most recent call last)
<ipython-input-...> in <module>
      1 # UNIT TEST
----> 2 Decoder_test(Decoder, create_look_ahead_mask, create_padding_mask)

~/work/W4A1/public_tests.py in Decoder_test(target, create_look_ahead_mask, create_padding_mask)
    235
    236     outd, att_weights = decoderk(x, encoderq_output, True, look_ahead_mask, create_padding_mask(x))
--> 237     assert np.allclose(outd[1, 1], [-0.0250004, 0.50791883, -1.5877104, 1.1047921]), "Wrong values in outd when training=True and use padding mask"
    238
    239     print("\033[92mAll tests passed")

AssertionError: Wrong values in outd when training=True and use padding mask

Did you modify anything in the create_look_ahead_mask() or create_padding_mask() functions?

Does your scaled_dot_product_attention() function pass its unit tests?
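For reference, here is a minimal sketch of scaled dot-product attention with the mask applied (an illustrative sketch, not the notebook's exact code; the function name and the mask convention are assumptions). The point of the questions above is that the way your mask functions encode "blocked" vs. "allowed" positions has to agree with how the mask is added to the logits here:

import tensorflow as tf

def scaled_dot_product_attention_sketch(q, k, v, mask=None):
    # raw attention scores: (..., seq_len_q, seq_len_k)
    matmul_qk = tf.matmul(q, k, transpose_b=True)

    # scale by the square root of the key dimension
    dk = tf.cast(tf.shape(k)[-1], tf.float32)
    scaled_attention_logits = matmul_qk / tf.math.sqrt(dk)

    # assumption: mask == 1 marks positions to block, 0 marks positions to attend
    # (TensorFlow-tutorial convention); if your notebook uses the opposite
    # convention (1 = keep), this line would be (1. - mask) * -1e9 instead
    if mask is not None:
        scaled_attention_logits += mask * -1e9

    # softmax over the key axis, then weight the values
    attention_weights = tf.nn.softmax(scaled_attention_logits, axis=-1)
    output = tf.matmul(attention_weights, v)
    return output, attention_weights

A mismatch between the convention used in the mask functions and the convention used at this step is a common cause of "wrong values" failures further down the pipeline.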

This is my create_padding_mask function:

def create_padding_mask(decoder_token_ids):
    """
    Creates a matrix mask for the padding cells

    Arguments:
        decoder_token_ids -- (n, m) matrix

    Returns:
        mask -- (n, 1, 1, m) binary tensor
    """
    seq = tf.cast(tf.math.equal(decoder_token_ids, 0), tf.float32)

    # add extra dimensions to add the padding
    # to the attention logits.
    return seq[:, tf.newaxis, tf.newaxis, :]
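As posted, this marks the zero (padding) tokens with 1.0. A quick sanity check with a toy batch (the values below follow directly from the code above; whether that is the convention the rest of the notebook expects is exactly the open question):

import tensorflow as tf

x = tf.constant([[7, 6, 0, 0, 1],
                 [1, 2, 3, 0, 0]])

mask = create_padding_mask(x)
print(mask.shape)  # (2, 1, 1, 5)
print(mask[0])     # [[[0. 0. 1. 1. 0.]]] -- ones where the input tokens are 0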

and my create_look_ahead_mask function:

def create_look_ahead_mask(sequence_length):
    """
    Returns an upper triangular matrix filled with ones

    Arguments:
        sequence_length -- matrix size

    Returns:
        mask -- (size, size) tensor
    """
    mask = tf.linalg.band_part(tf.ones((sequence_length, sequence_length)), -1, 0)
    return mask
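For what it's worth, tf.linalg.band_part(..., -1, 0) keeps the lower triangle, so as written this returns a lower-triangular matrix of ones (row i has ones at positions 0..i), which is worth double-checking against the docstring and against how the mask is consumed in scaled_dot_product_attention. A quick check:

print(create_look_ahead_mask(3))
# tf.Tensor(
# [[1. 0. 0.]
#  [1. 1. 0.]
#  [1. 1. 1.]], shape=(3, 3), dtype=float32)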

and my scaled_dot_product_attention function has passed all of its unit tests.