C5 W4 A5: Encoder

Hi All,
I have the same issue as described in

“AssertionError: Wrong values case 1”
but now I’m running out of ideas, since the description there is far from sufficient to solve this issue.

Has anyone had the same problem and would be willing to discuss this issue with me? Thanks!

And don’t post your solution ;-).

Hi @Michael_J_Schulz,

After the dropout, did you pass the output through the stack of encoding layers?

Juan.

One more thing: Can you please copy/paste here the entire text of the assertion, including the values you are getting?

If I skip the last instruction, i.e. “# Pass the encoded embedding through a dropout layer”, this is the error I get:


AssertionError                            Traceback (most recent call last)
<ipython-input-…> in <module>
      1 # UNIT TEST
----> 2 Encoder_test(Encoder)

~/work/W4A1/public_tests.py in Encoder_test(target)
    124     [[-0.4612937 ,  1.0697356 , -1.4127715 ,  0.8043293 ],
    125      [ 0.27027237,  0.28793618, -1.6370889 ,  1.0788803 ],
--> 126      [ 1.2370994 , -1.0687275 , -0.8945037 ,  0.7261319 ]]]), "Wrong values case 1"
    127
    128     encoderq_output = encoderq(x, True, np.array([[[[1., 1., 1.]]], [[[1., 1., 0.]]]]))

AssertionError: Wrong values case 1

Hi @Juan_Olano,
I did:
    for i in range(self.num_layers):
        x = self.enc_layers[i](x, training, mask)

And to make it a bit more thrilling: I get the same error as you when I drop that last line.

Can somebody on this side of the Atlantic Ocean help me while @Juan_Olano is offline :wink: ?

Hello Michael (@Michael_J_Schulz),
If you message me your code for the “unq_c5 Encoder” part, I can take a look at it.
(I am the one who posted the issue mentioned above.)

Dear Hasan,
I have sent it to you!

Thank you for your help!

I have put in some print statements, but this should not change anything.

Good morning! I know you already sent your code, but please don’t leave me hanging with this curiosity about what is going on with your code :slight_smile:

Would you send it to me as well, just so I can learn what is going on?

Thanks!

Juan

Good Morning Juan,

I will send you the code for the EncoderLayer and the Encoder; I suspect that the bug may not lie in the Encoder alone.

The additional print statements did not really help me, but perhaps they will give you some hints.

I have had some battles in this Specialization, but this is also new for me.

Thank you for your support,
Michael

ENCODER_LAYER:
    def call(self, x, training, mask):
        """
        Forward pass for the Encoder Layer

        Arguments:
            x -- Tensor of shape (batch_size, input_seq_len, fully_connected_dim)
            training -- Boolean, set to true to activate
                        the training mode for dropout layers
            mask -- Boolean mask to ensure that the padding is not
                    treated as part of the input
        Returns:
            encoder_layer_out -- Tensor of shape (batch_size, input_seq_len, embedding_dim)
        """
        # START CODE HERE
        # calculate self-attention using mha (~1 line).
        # Dropout is added by Keras automatically if the dropout parameter is non-zero during training
        attn_output = self.mha(x, x, training=training, attention_mask=mask)  # Self-attention (batch_size, input_seq_len, fully_connected_dim)

        # apply layer normalization on the sum of the input and the attention output to get the
        # output of the multi-head attention layer (~1 line)
        out1 = self.layernorm1(tf.keras.layers.add([x, attn_output]))  # (batch_size, input_seq_len, fully_connected_dim)

        # pass the output of the multi-head attention layer through a ffn (~1 line)
        ffn_output = self.ffn(out1)  # (batch_size, input_seq_len, fully_connected_dim)

        # apply dropout layer to ffn output during training (~1 line)
        ffn_output = self.dropout_ffn(ffn_output, training=training)

        # apply layer normalization on the sum of the output from multi-head attention and ffn output to get the
        # output of the encoder layer (~1 line)
        encoder_layer_out = self.layernorm2(tf.keras.layers.add([out1, ffn_output]))  # (batch_size, input_seq_len, embedding_dim)
        # END CODE HERE

        return encoder_layer_out

ENCODER with additional print statements:
    def call(self, x, training, mask):
        """
        Forward pass for the Encoder

        Arguments:
            x -- Tensor of shape (batch_size, input_seq_len)
            training -- Boolean, set to true to activate
                        the training mode for dropout layers
            mask -- Boolean mask to ensure that the padding is not
                    treated as part of the input
        Returns:
            out2 -- Tensor of shape (batch_size, input_seq_len, embedding_dim)
        """
        # mask = create_padding_mask(x)
        seq_len = tf.shape(x)[1]

        # START CODE HERE
        # Pass input through the Embedding layer
        x = self.embedding(x)  # (batch_size, input_seq_len, embedding_dim)
        print(x)

        # Scale embedding by multiplying it by the square root of the embedding dimension
        x *= tf.math.sqrt(tf.cast(self.embedding_dim, tf.float32))
        print(x)

        # Add the position encoding to embedding
        # x += self.pos_encoding[0, :x.shape[1], :]
        x += self.pos_encoding[:, :seq_len, :]
        print("After position encoding = ", x)

        # Pass the encoded embedding through a dropout layer
        x = self.dropout(x, training)
        print(x)

        print("num_layers = ", self.num_layers)
        for i in range(self.num_layers):
            print("i = ", i)
            print("x = ", x)
            print(self.enc_layers[i](x, training, mask))

        # Pass the output through the stack of encoding layers
        for i in range(self.num_layers):
            x = self.enc_layers[i](x, training, mask)
        # END CODE HERE

        return x  # (batch_size, input_seq_len, embedding_dim)

Thank you @Michael_J_Schulz! I’ll look at this and provide hints wherever I find an issue.

I also had my battles in the specialization, as it had been ages since I had worked with algebra and calculus :slight_smile:

@Michael_J_Schulz ,

I’ve checked both routines. I think your hunch on EncoderLayer is right. These are my initial hints:

Check the following:

In EncoderLayer:
attn_output - check the params passed to self.mha
out1 - check the params passed to self.layernorm1
encoder_layer_out - you can simplify the params passed to self.layernorm2

In Encoder:

Regarding “Scale embedding by multiplying it by the square root of the embedding dimension”: your x *= … works just fine. I just want to show you another way to do it: x *= np.sqrt(self.embedding_dim)
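As a quick illustration that both forms give the same scaling factor, here is a tiny self-contained check (embedding_dim = 4 is just a placeholder, not the notebook’s setting):

    import numpy as np
    import tensorflow as tf

    embedding_dim = 4  # placeholder value, not the notebook's actual setting

    # Both expressions compute sqrt(embedding_dim) as a scalar.
    scale_tf = tf.math.sqrt(tf.cast(embedding_dim, tf.float32))
    scale_np = np.sqrt(embedding_dim)
    print(float(scale_tf), float(scale_np))  # 2.0 2.0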

So basically I think you should look at the 3 lines in the EncoderLayer described above.

Let’s start with this level of hints; if you still have issues, I will provide the next level.

Juan

I have to tell you that I adapted attn_output following the recommendations from @Hasan_Resul_Cesur. Before, I had
attn_output = self.mha(x, x, x, mask)
and that worked out as well (leading to the same error at the end of the Encoder).
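For reference, here is a tiny stand-alone sketch of the tf.keras.layers.MultiHeadAttention call pattern (num_heads, key_dim and the tensor shape are made-up values, not the assignment’s configuration):

    import tensorflow as tf

    # Toy layer, only to illustrate the query/value/key call pattern.
    mha = tf.keras.layers.MultiHeadAttention(num_heads=2, key_dim=4)

    x = tf.random.uniform((1, 3, 4))  # (batch_size, seq_len, features)

    # Self-attention: query, value and key are all x. If key is omitted,
    # it defaults to value, so both calls compute the same thing.
    out_qvk = mha(x, x, x)  # query=x, value=x, key=x
    out_qv = mha(x, x)      # query=x, value=x, key defaults to x
    print(out_qvk.shape, out_qv.shape)  # (1, 3, 4) (1, 3, 4)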

I will try to solve the puzzle about out1.

The main problem with these exercises is that you hardly have a chance of finding the bug by yourself, since you don’t have enough test data. This is not aimed at you, but at the organisation behind the course.

@Michael_J_Schulz regarding out1, check out this part of the instructions:

“apply layer normalization on sum of the input and the attention output”

This phrase says exactly what you have to do. You are almost there. Try replacing the tf.keras.layers.add function with a much simpler approach, like…

…spoiler alert… stop here if you don’t want the hint…
.
.
.
.
.
.
.
.
a+b :slight_smile:

:sob: :sob: :sob:

IT WORKED!

:grinning: By the way, I’m 56, but here I feel like I’m back in university again.

And another BTW: I used the normal addition with + in the beginning, but failed. Now I assume I failed because of other reasons.
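For what it’s worth, here is a small self-contained check (toy shapes, not the assignment’s) suggesting that the two ways of writing the residual sum produce the same numbers, which would support the “other reasons” suspicion:

    import tensorflow as tf

    # Toy tensors standing in for x and attn_output.
    x = tf.random.uniform((2, 3, 4))
    attn_output = tf.random.uniform((2, 3, 4))

    layernorm = tf.keras.layers.LayerNormalization(epsilon=1e-6)

    # Plain tensor addition ...
    out_plus = layernorm(x + attn_output)
    # ... versus the tf.keras.layers.add form.
    out_add_layer = layernorm(tf.keras.layers.add([x, attn_output]))

    print(bool(tf.reduce_all(tf.abs(out_plus - out_add_layer) < 1e-6)))  # True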

Thank you, you saved my day, my week and perhaps my certificate! I don’t have much time left before I start working again…

I will try to give you some likes or whatever is possible here.

Best Regards,
Michael

I am super glad that it worked!!!

I am 57. And yes, I feel like I’m back in university as well :slight_smile:

Let’s go for that certificate!

Juan

I have finished the certificate; it was 2 AM this morning when I saw the last “All tests passed”.

My new job starts on Tuesday, so it was just in time ;-).

Thank you again!
