C4W1_Assignment Exercise 4 - Translator

Hello.

I am having difficulty completing the Exercise 4 - Translator cell.

All previous blocks up to Exercise 3 passed their unit tests successfully, and the quick check of my implementation just below the Exercise 4 - Translator block matches the Expected Output.

However, when I execute the w1_unittest.test_translator(Translator, Encoder, Decoder) cell, I get the following error, raised by attn_output = …

The error message is:

InvalidArgumentError: Exception encountered when calling layer 'key' (type EinsumDense).

{{function_node __wrapped__Einsum_N_2_device_/job:localhost/replica:0/task:0/device:GPU:0}} Expected dimension 512 at axis 0 of the input shaped [256,1,256] but got dimension 256 [Op:Einsum] name:

Call arguments received by layer 'key' (type EinsumDense):
  • inputs=tf.Tensor(shape=(64, 15, 512), dtype=float32)

Can anyone give me some advice please?

Hi @sugaprho

The reason for this error is clearly a dimension mismatch. If I had to guess, you probably hard-coded the value 256 instead of using the units parameter in this or a previous exercise?
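To illustrate, here is a minimal sketch of the pitfall I mean. The constructor signature Decoder(vocab_size, units) is an assumption based on this thread, not necessarily the assignment's exact code:

```python
import tensorflow as tf

class Decoder(tf.keras.layers.Layer):
    def __init__(self, vocab_size, units):
        super().__init__()
        # Buggy: a hard-coded 256 ignores the `units` argument, so
        # Decoder(10000, 512) still produces (batch, time, 256) tensors,
        # and downstream layers built for 512 features fail inside Einsum.
        # self.embedding = tf.keras.layers.Embedding(vocab_size, 256)

        # Fixed: propagate the constructor argument everywhere.
        self.embedding = tf.keras.layers.Embedding(vocab_size, units)
```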

Hi @arvyzukai, I really appreciate your kind response.
I didn't hard-code values; I used the parameters from the constructor and call method.
However, I looked closely for my mistake and finally figured out what was wrong in my source code.

To find it, I printed the shape of the data layer by layer.

In the Decoder, the data shape evolves as follows.

  1. Input: (64, 15)
  2. Embedding: (64, 15, 256)
  3. LSTM: (64, 15, 256)
  4. Attention: (64, 15, 256)
  5. LSTM: (64, 256) → (64, 15, 256)
  6. Dense: (64, 12000) → (64, 15, 12000)

In my case, I made a mistake at step 5; the arrows at steps 5 and 6 show the shapes before and after the fix. After I corrected a parameter, my code passed the unit test. A minimal sketch of this shape-printing approach is below.
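For reference, here is the kind of shape-printing I mean. The layer names and composition are my own illustration, not the assignment's exact code:

```python
import tensorflow as tf

class DebugDecoder(tf.keras.layers.Layer):
    def __init__(self, vocab_size=12000, units=256):
        super().__init__()
        self.embedding = tf.keras.layers.Embedding(vocab_size, units)
        self.pre_lstm = tf.keras.layers.LSTM(units, return_sequences=True)
        self.attention = tf.keras.layers.MultiHeadAttention(num_heads=1, key_dim=units)
        self.post_lstm = tf.keras.layers.LSTM(units, return_sequences=True)
        self.dense = tf.keras.layers.Dense(vocab_size)

    def call(self, context, target):
        print("Step 1, Input:", target.shape)     # (64, 15)
        x = self.embedding(target)
        print("Step 2, Embedding:", x.shape)      # (64, 15, 256)
        x = self.pre_lstm(x)
        print("Step 3, LSTM:", x.shape)           # (64, 15, 256)
        x = self.attention(query=x, value=context)
        print("Step 4, Attention:", x.shape)      # (64, 15, 256)
        # return_sequences=True keeps the time axis; without it an LSTM
        # returns only its final step, i.e. shape (64, 256).
        x = self.post_lstm(x)
        print("Step 5, LSTM:", x.shape)           # (64, 15, 256)
        x = self.dense(x)
        print("Step 6, Dense:", x.shape)          # (64, 15, 12000)
        return x

decoder = DebugDecoder()
context = tf.random.normal((64, 15, 256))                       # stand-in encoder output
target = tf.random.uniform((64, 15), 1, 12000, dtype=tf.int32)  # stand-in token ids
decoder(context, target)
```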

Thank you.


Hi,
can you share where the issue was? I am facing a similar problem: my Decoder tests pass, and the Translator passes the "quick check" with the proper output, but it fails on the unit tests. It looks to me like the embedding layer in the decoder produces a different output size. Here's the output of the input parameters and dimensions at different steps:


Translator: 10000 512
Encoder: 10000, 512
Decoder: 10000, 512
Translator Context: (64, 15)
Translator Target: (64, 14)
Encoder context: (64, 15)
Encoder embedding: (64, 15, 512)
Encoder LSTM: (64, 15, 512)
Translator: Encoded context: (64, 15, 512)
Decoder input: (64, 15, 512) (64, 14)
Decoder Embedding: (64, 14, 256)
Decoder LSTM: (64, 14, 256)

Just for clarity, my error is similar:

InvalidArgumentError: Exception encountered when calling layer 'key' (type EinsumDense).

{{function_node __wrapped__Einsum_N_2_device_/job:localhost/replica:0/task:0/device:GPU:0}} Expected dimension 512 at axis 0 of the input shaped [256,1,256] but got dimension 256 [Op:Einsum] name: 

Call arguments received by layer 'key' (type EinsumDense):
  • inputs=tf.Tensor(shape=(64, 19, 512), dtype=float32)

It is raised on calling self.mha in CrossAttention.
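For what it's worth, this class of error can be reproduced in isolation. The mechanism below is my assumption, not taken from the assignment's code: Keras MultiHeadAttention builds its internal EinsumDense projections from the shapes of its first call, so a later call with a different feature depth fails inside those projections:

```python
import tensorflow as tf

mha = tf.keras.layers.MultiHeadAttention(num_heads=1, key_dim=256)

x_small = tf.random.normal((64, 15, 256))
mha(query=x_small, value=x_small)   # first call builds kernels for depth 256

x_large = tf.random.normal((64, 19, 512))
mha(query=x_large, value=x_large)   # raises InvalidArgumentError in EinsumDense
```

So any tensor feeding self.mha that is 256-wide where the test expects 512, like the (64, 14, 256) Decoder Embedding output in the trace above, can produce exactly this failure.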

Hi @Krzysztof_Jakubczyk

I could not answer for a while. I hope you have already solved this problem.

Unfortunately, while debugging my code, I reset the notebook file and re-coded it from scratch.

In my experience, if you passed w1_unittest.test_cross_attention(CrossAttention), there is no problem in self.mha itself. In the Decoder class, I pass the variable along sequentially, function by function, because the given code is based on the functional API.

I don't remember exactly, but as far as I recall, my mistake was not passing the variable x along correctly; see the sketch below.
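Something like this minimal illustration (the layer setup is made up for the example, not the assignment's code):

```python
import tensorflow as tf

embedding = tf.keras.layers.Embedding(12000, 256)
rnn = tf.keras.layers.LSTM(256, return_sequences=True)
dense = tf.keras.layers.Dense(12000)

target = tf.random.uniform((64, 15), 1, 12000, dtype=tf.int32)

x = embedding(target)   # (64, 15, 256)
# rnn(x)                # bug: output discarded, x is still the embedding output
x = rnn(x)              # fix: reassign x so the next layer receives the LSTM output
logits = dense(x)       # (64, 15, 12000)
```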

I hope you complete the exercise successfully.

Regards,