C4W1_Assignment Exercise 4 - Translator

Hello.

I am having difficulty completing the Exercise 4 - Translator cell.

All previous blocks up to Exercise 3 passed their unit tests successfully, and the quick check of my implementation just below the Exercise 4 - Translator block matches the Expected Output.

However, when I execute the w1_unittest.test_translator(Translator, Encoder, Decoder) cell, I get the following error, raised by attn_output = …

The error message is:

InvalidArgumentError: Exception encountered when calling layer 'key' (type EinsumDense).

{{function_node __wrapped__Einsum_N_2_device_/job:localhost/replica:0/task:0/device:GPU:0}} Expected dimension 512 at axis 0 of the input shaped [256,1,256] but got dimension 256 [Op:Einsum] name:

Call arguments received by layer 'key' (type EinsumDense):
  • inputs=tf.Tensor(shape=(64, 15, 512), dtype=float32)

Can anyone give me some advice please?

Hi @sugaprho

The reason for this error is clearly a dimension mismatch. If I had to guess, you probably hard-coded the value 256 instead of using the units parameter in this or a previous exercise?
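To illustrate, here is a minimal sketch of the pitfall I mean. The constructor signature Decoder(vocab_size, units) is an assumption based on this thread, not necessarily the assignment's exact code:

```python
import tensorflow as tf

class Decoder(tf.keras.layers.Layer):
    def __init__(self, vocab_size, units):
        super().__init__()
        # Buggy: a hard-coded 256 ignores the `units` argument, so
        # Decoder(10000, 512) still produces (batch, time, 256) tensors,
        # and downstream layers built for 512 features fail inside Einsum.
        # self.embedding = tf.keras.layers.Embedding(vocab_size, 256)

        # Fixed: propagate the constructor argument everywhere.
        self.embedding = tf.keras.layers.Embedding(vocab_size, units)
```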

Hi @arvyzukai, I really appreciate your kind response.
I didn't hard-code values; I used the parameters from the constructor and call method.
However, I looked closely for my mistake and finally figured out what was wrong in my source code.

To find it, I printed the shape of the data layer by layer.

In the Decoder, the data shape evolves as follows.

  1. Input: (64, 15)
  2. Embedding: (64, 15, 256)
  3. LSTM: (64, 15, 256)
  4. Attention: (64, 15, 256)
  5. LSTM: (64, 256) → (64, 15, 256)
  6. Dense: (64, 12000) → (64, 15, 12000)

In my case, I made a mistake at step 5; the arrows at steps 5 and 6 show the shapes before and after the fix. After I corrected a parameter, my code passed the unit test. A minimal sketch of this shape-printing approach is below.
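For reference, here is the kind of shape-printing I mean. The layer names and composition are my own illustration, not the assignment's exact code:

```python
import tensorflow as tf

class DebugDecoder(tf.keras.layers.Layer):
    def __init__(self, vocab_size=12000, units=256):
        super().__init__()
        self.embedding = tf.keras.layers.Embedding(vocab_size, units)
        self.pre_lstm = tf.keras.layers.LSTM(units, return_sequences=True)
        self.attention = tf.keras.layers.MultiHeadAttention(num_heads=1, key_dim=units)
        self.post_lstm = tf.keras.layers.LSTM(units, return_sequences=True)
        self.dense = tf.keras.layers.Dense(vocab_size)

    def call(self, context, target):
        print("Step 1, Input:", target.shape)     # (64, 15)
        x = self.embedding(target)
        print("Step 2, Embedding:", x.shape)      # (64, 15, 256)
        x = self.pre_lstm(x)
        print("Step 3, LSTM:", x.shape)           # (64, 15, 256)
        x = self.attention(query=x, value=context)
        print("Step 4, Attention:", x.shape)      # (64, 15, 256)
        # return_sequences=True keeps the time axis; without it an LSTM
        # returns only its final step, i.e. shape (64, 256).
        x = self.post_lstm(x)
        print("Step 5, LSTM:", x.shape)           # (64, 15, 256)
        x = self.dense(x)
        print("Step 6, Dense:", x.shape)          # (64, 15, 12000)
        return x

decoder = DebugDecoder()
context = tf.random.normal((64, 15, 256))                       # stand-in encoder output
target = tf.random.uniform((64, 15), 1, 12000, dtype=tf.int32)  # stand-in token ids
decoder(context, target)
```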

Thank you.


Hi,
can you share where the issue was? I am facing a similar problem: my Decoder tests pass, and the Translator passes the "quick check" with the proper output, but it fails on the unit tests. It looks to me like the embedding layer in the decoder produces a different output size. Here's the output of the input parameters and dimensions at different steps:


Translator: 10000 512
Encoder: 10000, 512
Decoder: 10000, 512
Translator Context: (64, 15)
Translator Target: (64, 14)
Encoder context: (64, 15)
Encoder embedding: (64, 15, 512)
Encoder LSTM: (64, 15, 512)
Translator: Encoded context: (64, 15, 512)
Decoder input: (64, 15, 512) (64, 14)
Decoder Embedding: (64, 14, 256)
Decoder LSTM: (64, 14, 256)

Just for clarity, my error is similar:

InvalidArgumentError: Exception encountered when calling layer 'key' (type EinsumDense).

{{function_node __wrapped__Einsum_N_2_device_/job:localhost/replica:0/task:0/device:GPU:0}} Expected dimension 512 at axis 0 of the input shaped [256,1,256] but got dimension 256 [Op:Einsum] name: 

Call arguments received by layer 'key' (type EinsumDense):
  • inputs=tf.Tensor(shape=(64, 19, 512), dtype=float32)

It is raised on calling self.mha in CrossAttention.
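For what it's worth, this class of error can be reproduced in isolation. The mechanism below is my assumption, not taken from the assignment's code: Keras MultiHeadAttention builds its internal EinsumDense projections from the shapes of its first call, so a later call with a different feature depth fails inside those projections:

```python
import tensorflow as tf

mha = tf.keras.layers.MultiHeadAttention(num_heads=1, key_dim=256)

x_small = tf.random.normal((64, 15, 256))
mha(query=x_small, value=x_small)   # first call builds kernels for depth 256

x_large = tf.random.normal((64, 19, 512))
mha(query=x_large, value=x_large)   # raises InvalidArgumentError in EinsumDense
```

So any tensor feeding self.mha that is 256-wide where the test expects 512, like the (64, 14, 256) Decoder Embedding output in the trace above, can produce exactly this failure.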

Hi @Krzysztof_Jakubczyk

I could not answer for a while. I hope you have already solved this problem.

Unfortunately, while debugging my code, I reset the notebook file and re-coded it from scratch.

In my experience, if you passed w1_unittest.test_cross_attention(CrossAttention), there is no problem in self.mha itself. In the Decoder class, I pass the variable along sequentially, function by function, because the given code is based on the functional API.

I don't remember exactly, but as far as I recall, my mistake was not passing the variable x along correctly; see the sketch below.
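Something like this minimal illustration (the layer setup is made up for the example, not the assignment's code):

```python
import tensorflow as tf

embedding = tf.keras.layers.Embedding(12000, 256)
rnn = tf.keras.layers.LSTM(256, return_sequences=True)
dense = tf.keras.layers.Dense(12000)

target = tf.random.uniform((64, 15), 1, 12000, dtype=tf.int32)

x = embedding(target)   # (64, 15, 256)
# rnn(x)                # bug: output discarded, x is still the embedding output
x = rnn(x)              # fix: reassign x so the next layer receives the LSTM output
logits = dense(x)       # (64, 15, 12000)
```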

I hope you complete the exercise successfully.

Regards,