Good morning, Juan,
I am sending you the code for the Encoder Layer and the Encoder; I suspect the bug may not lie in the Encoder alone.
The additional print statements did not really help me, but perhaps their output will give you some hints.
I have fought a few battles here in the Specialisation, but this material is also new to me.
Thank you for your support,
Michael
ENCODER_LAYER:
def call(self, x, training, mask):
    """
    Forward pass for the Encoder Layer

    Arguments:
        x -- Tensor of shape (batch_size, input_seq_len, fully_connected_dim)
        training -- Boolean, set to true to activate
                    the training mode for dropout layers
        mask -- Boolean mask to ensure that the padding is not
                treated as part of the input
    Returns:
        encoder_layer_out -- Tensor of shape (batch_size, input_seq_len, embedding_dim)
    """
    # START CODE HERE
    # calculate self-attention using mha (~1 line).
    # Dropout is added by Keras automatically if the dropout parameter is non-zero during training
    attn_output = self.mha(x, x, training=training, attention_mask=mask)  # Self-attention (batch_size, input_seq_len, fully_connected_dim)

    # apply layer normalization on the sum of the input and the attention output to get the
    # output of the multi-head attention layer (~1 line)
    out1 = self.layernorm1(tf.keras.layers.add([x, attn_output]))  # (batch_size, input_seq_len, fully_connected_dim)

    # pass the output of the multi-head attention layer through a ffn (~1 line)
    ffn_output = self.ffn(out1)  # (batch_size, input_seq_len, fully_connected_dim)

    # apply dropout layer to ffn output during training (~1 line)
    ffn_output = self.dropout_ffn(ffn_output, training=training)

    # apply layer normalization on the sum of the output from multi-head attention and ffn output to get the
    # output of the encoder layer (~1 line)
    encoder_layer_out = self.layernorm2(tf.keras.layers.add([out1, ffn_output]))  # (batch_size, input_seq_len, embedding_dim)
    # END CODE HERE

    return encoder_layer_out
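As a quick sanity check on what self.mha expects, I also ran the small standalone snippet below. It is not part of the assignment code; it only assumes TF 2.x and the built-in tf.keras.layers.MultiHeadAttention, and the toy sizes are made up. It just confirms the mask shape the layer accepts and that the attention output keeps the input shape:

import tensorflow as tf

# Toy sizes chosen just for the check, not taken from the notebook.
batch_size, seq_len, embedding_dim, num_heads = 2, 5, 8, 2

x = tf.random.uniform((batch_size, seq_len, embedding_dim))
# Boolean padding mask broadcastable to (batch_size, num_heads, seq_len, seq_len);
# True means "attend to this position".
mask = tf.ones((batch_size, seq_len, seq_len), dtype=tf.bool)

mha = tf.keras.layers.MultiHeadAttention(num_heads=num_heads, key_dim=embedding_dim)
attn_output = mha(x, x, attention_mask=mask, training=False)
print(attn_output.shape)  # (2, 5, 8) -- same shape as the query input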
ENCODER (with additional print statements):
def call(self, x, training, mask):
    """
    Forward pass for the Encoder

    Arguments:
        x -- Tensor of shape (batch_size, input_seq_len)
        training -- Boolean, set to true to activate
                    the training mode for dropout layers
        mask -- Boolean mask to ensure that the padding is not
                treated as part of the input
    Returns:
        out2 -- Tensor of shape (batch_size, input_seq_len, embedding_dim)
    """
    # mask = create_padding_mask(x)
    seq_len = tf.shape(x)[1]

    # START CODE HERE
    # Pass input through the Embedding layer
    x = self.embedding(x)  # (batch_size, input_seq_len, embedding_dim)
    print(x)

    # Scale embedding by multiplying it by the square root of the embedding dimension
    x *= tf.math.sqrt(tf.cast(self.embedding_dim, tf.float32))
    print(x)

    # Add the position encoding to embedding
    # x += self.pos_encoding[0, :x.shape[1], :]
    x += self.pos_encoding[:, :seq_len, :]
    print("After position encoding = ", x)

    # Pass the encoded embedding through a dropout layer
    x = self.dropout(x, training=training)
    print(x)

    print("num_layers = ", self.num_layers)
    # Debugging only: each iteration here passes the same pre-stack x into the layer
    # (x is not updated); the actual forward pass happens in the loop below.
    for i in range(self.num_layers):
        print("i = ", i)
        print("x = ", x)
        print(self.enc_layers[i](x, training, mask))

    # Pass the output through the stack of encoding layers
    for i in range(self.num_layers):
        x = self.enc_layers[i](x, training, mask)
    # END CODE HERE

    return x  # (batch_size, input_seq_len, embedding_dim)
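In case the raw print(x) output is too noisy to compare, here is a small helper I put together (my own sketch, not from the notebook): it prints only the shape and summary statistics of each intermediate tensor. I would call it after the embedding, scaling, positional-encoding and dropout steps instead of the plain prints, so it is easier to see at which step the values start to diverge from the expected output:

import tensorflow as tf

def describe(name, t):
    """Print the shape and summary statistics of a tensor instead of its full contents."""
    t = tf.convert_to_tensor(t)
    tf.print(name,
             "shape =", tf.shape(t),
             "mean =", tf.reduce_mean(t),
             "std =", tf.math.reduce_std(t))

# Hypothetical usage inside Encoder.call:
#   describe("after embedding", x)
#   describe("after scaling", x)
#   describe("after position encoding", x)
#   describe("after dropout", x)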