Dimensional size error for C4W1_Assignment Decoder test

For Exercise 3 Decoder Test, I got:


Tensor of contexts has shape: (64, 14, 256)
Tensor of right-shifted translations has shape: (64, 15)
Tensor of logits has shape: (64, 256)

Expected Output
Tensor of contexts has shape: (64, 14, 256)
Tensor of right-shifted translations has shape: (64, 15)
Tensor of logits has shape: (64, 15, 12000)

Can someone help me fix this dimension error?
Here is the snippet of my code for the Exercise 3 Decoder part:


The dense layer with log-softmax activation

    self.output_layer = tf.keras.layers.Dense(
        units=units,
        activation= tf.nn.log_softmax
    ) 

Hi, Zhiyi_Li2

If I remember correctly, I encountered a similar issue. The problem was related to the input of the embedding layer (edit: or output layer). Perhaps the issue is related to the parameters you are passing to that layer.

I hope it proves helpful. Regards

Thanks. Part of my code snippet is:


x = context
y = self.embedding(target)

x, hidden_state, cell_state = self.pre_attention_rnn(x, initial_state=None)

x = self.attention(x, y)

x = self.post_attention_rnn(x)


Not sure which step is wrong.

You are mixing up x and y. Moreover, y is not present in the assignment, and the line “x = context” is not part of the task either. Similarly, in the assignment the result of self.embedding is stored in x, not y.


Got it. I revised my code snippet, but the output dimensions are still not correct:
Tensor of contexts has shape: (64, 18, 256)
Tensor of right-shifted translations has shape: (64, 14)
Tensor of logits has shape: (64, 256)

Expected Output
Tensor of contexts has shape: (64, 14, 256)
Tensor of right-shifted translations has shape: (64, 15)
Tensor of logits has shape: (64, 15, 12000)

Here is my code snippet:


    x = self.embedding(target)

    # Pass the embedded input into the pre attention LSTM
    # Hints:
    # - The LSTM you defined earlier should return the output alongside the state (made up of two tensors)
    # - Pass in the state to the LSTM (needed for inference)
    x, hidden_state, cell_state = self.pre_attention_rnn(x, initial_state=None)

    # Perform cross attention between the context and the output of the LSTM (in that order)
    x = self.attention(context, x) # 

    # Do a pass through the post attention LSTM
    x = self.post_attention_rnn(x)

    # Compute the logits
    logits = self.output_layer(x)

I changed the output layer as:

self.output_layer = tf.keras.layers.Dense(
units =vocab_size,
activation= tf.nn.log_softmax
)
The output looks like:


Tensor of contexts has shape: (64, 17, 256)
Tensor of right-shifted translations has shape: (64, 21)
Tensor of logits has shape: (64, 12000)

Expected Output
Tensor of contexts has shape: (64, 14, 256)
Tensor of right-shifted translations has shape: (64, 15)
Tensor of logits has shape: (64, 15, 12000)
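For reference, `tf.keras.layers.Dense` acts only on the last axis of its input; every other axis passes through unchanged. So the logits can only have a time dimension if the tensor entering the Dense layer still has one. A shape-only sketch in plain Python (not the actual Keras layer), using the shapes quoted in this thread:

```python
# Dense(units) replaces the last axis of the input shape with `units`;
# all leading axes (batch, time, ...) are kept as-is.
def dense_output_shape(input_shape, units):
    return input_shape[:-1] + (units,)

# If the post-attention LSTM dropped the time axis, Dense cannot restore it:
print(dense_output_shape((64, 256), 12000))      # (64, 12000) - what the thread sees
# If the time axis survives, Dense produces per-timestep logits:
print(dense_output_shape((64, 15, 256), 12000))  # (64, 15, 12000) - expected
```

This is why changing `units` to `vocab_size` fixed the last dimension (12000) but not the missing time dimension (15).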

Still not correct.


I think the problem is caused by post_attention_rnn.


Here is my code snippet:

The RNN after attention

    self.post_attention_rnn = tf.keras.layers.LSTM(
        units=units,
        return_sequences=False
    )  

Output size is not correct.

Hi Zhiyi_Li2,

Your code appears to be in good shape. However, there is a policy regarding displaying code here.

The issue might be in another layer (maybe in the embedding with the parameters). Typically, @arvyzukai and gent.spah are the individuals who assist me with my questions.

If you’d like to get in touch with them, I’m sure they will help you with your issue.

Best regards.

Thanks, you may be right. I think it is just a simple dimension issue; I'll wait for an NLP master to jump in and help. I will follow the code policy and not show the code next time.

@arvyzukai Can you help to see what is wrong for this post_attention_rnn:


self.post_attention_rnn = tf.keras.layers.LSTM(
units=units,
return_sequences=False
)


The output size is (64, 256); the correct one should be (64, 15, 256).

Hi @Zhiyi_Li2

There is nothing wrong with your post_attention_rnn.

Looking at the dimensions, it seems that you lost the sequence dimension somewhere. In other words, the problem should lie in your call() implementation. Please pay close attention to the code hints and also the instructions.

Let me know if you find any of them confusing.
Cheers
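To see how the sequence dimension gets lost: with `return_sequences=False`, a Keras LSTM returns only the last timestep's output, collapsing `(batch, timesteps, units)` to `(batch, units)`. A shape-only sketch of that behavior (plain Python, shapes assumed from the thread):

```python
# Output shape of tf.keras.layers.LSTM for a (batch, timesteps, features) input:
# return_sequences=True  -> one output per timestep, (batch, timesteps, units)
# return_sequences=False -> only the last timestep,  (batch, units)
def lstm_output_shape(batch, timesteps, units, return_sequences):
    return (batch, timesteps, units) if return_sequences else (batch, units)

print(lstm_output_shape(64, 14, 256, True))   # (64, 14, 256)
print(lstm_output_shape(64, 14, 256, False))  # (64, 256) - the shape seen after post_attention_rnn
```

This matches the printouts below: `(64, 14, 256)` going in, `(64, 256)` coming out.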

I tested by printing out the shape before and after the post-attention function:


The RNN after attention

    self.post_attention_rnn = tf.keras.layers.LSTM(
        units=units,
        return_sequences=False
    )  

Output:
x.shape after attention: (64, 14, 256)
x.shape post attention: (64, 256)

Tensor of right-shifted translations has shape: (64, 14)
Tensor of logits has shape: (64, 12000)

Expected Output
Tensor of contexts has shape: (64, 14, 256)
Tensor of right-shifted translations has shape: (64, 15)
Tensor of logits has shape: (64, 15, 12000)

For the right-shifted translations I wrote a code snippet like:


target_emb = self.embedding(target)
# Pass a batch of sentences to translate from English to Portuguese
# encoder(to_translate)
x = context
# Pass the embedded input into the pre attention LSTM
# Hints:
# - The LSTM you defined earlier should return the output alongside the state (made up of two tensors)
# - Pass in the state to the LSTM (needed for inference)
target_x, hidden_state, cell_state = self.pre_attention_rnn(target_emb, initial_state=state)

# Perform cross attention between the context and the output of the LSTM (in that order)
x = self.attention(x, target_x)

Is something wrong with how I handle target? I noticed the dimension should be 15 instead of 14.

Something seems wrong in the target operations.

Moving one step further: after I changed the code to:


self.post_attention_rnn = tf.keras.layers.LSTM(
units=units,
return_sequences=True,
return_state=False
)
The output sequence is much better:
Tensor of contexts has shape: (64, 15, 256)
Tensor of right-shifted translations has shape: (64, 14)
Tensor of logits has shape: (64, 14, 12000)

Expected Output
Tensor of contexts has shape: (64, 14, 256)
Tensor of right-shifted translations has shape: (64, 15)
Tensor of logits has shape: (64, 15, 12000)

But I noticed some dimensions are still not aligned: they should be 15 instead of 14.
Any idea?

Never mind, all tests passed. I think the system has a problem with Exercise 4.

Correction: Exercise 3.

Hi @Zhiyi_Li2

Have you passed the Assignment?

The reason for the 14 and 15 mismatch should be the use of context (14) versus target (15). So make sure you embed the target in the Exercise 5 decoder (not the context).

Cheers
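That 14 vs. 15 mismatch also follows from how cross-attention shapes work: the attention output keeps the sequence length of the query (the embedded target), while the context only sets what is attended over. A shape-only sketch (plain Python, assuming the query-first convention the mentor describes):

```python
# Cross-attention output has the query's sequence length; the context
# (keys/values) contributes only the attended information, not the length.
def cross_attention_output_shape(query_shape, context_shape):
    batch, query_len, d_model = query_shape
    return (batch, query_len, d_model)

# Query built from the target (length 15), context from the source (length 14):
print(cross_attention_output_shape((64, 15, 256), (64, 14, 256)))  # (64, 15, 256)
```

If the context (length 14) is embedded instead of the target, every downstream shape inherits 14 where the tests expect 15.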

Similar problem here. Solved. The assignment doc states “Post-attention lstm. Another LSTM layer. For this one you don’t need it to return the state.” True in theory: the state information is not required. However, return_sequences needs to be True for post_attention_rnn in order to pass the checks and tests in the assignment (my model is training now…)
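Putting the whole thread together, the expected logits shape falls out of a simple shape trace through the decoder. A shape-only sketch (plain Python, not the actual Keras model), assuming batch=64, target length 15, units=256, and vocab_size=12000 as in the expected output:

```python
# Shape trace through the decoder's call(), step by step.
def decoder_logits_shape(batch=64, tgt_len=15, units=256, vocab_size=12000):
    x = (batch, tgt_len, units)   # embedding of the right-shifted target
    # pre-attention LSTM with return_sequences=True keeps (batch, tgt_len, units)
    # cross attention keeps the query's (target's) length: still (batch, tgt_len, units)
    # post-attention LSTM must ALSO use return_sequences=True to keep tgt_len
    return x[:-1] + (vocab_size,)  # Dense(vocab_size) maps only the last axis

print(decoder_logits_shape())  # (64, 15, 12000)
```

Dropping `return_sequences=True` at the post-attention LSTM, or using `units` instead of `vocab_size` in the Dense layer, reproduces the two wrong shapes seen earlier in the thread.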

It doesn’t matter; if you run all the code again, it could become 13, 14, or 15.