Hi,
I am having some trouble with Exercise 4: Translator. The previous exercises pass, and the basic check for Exercise 4 does as well, but it fails the unit tests with the following error:
37 # Call the MH attention by passing in the query and value
38 # For this case the query should be the translation and the value the encoded sentence to translate
39 # Hint: Check the call arguments of MultiHeadAttention in the docs
---> 40 attn_output = self.mha(
41 query=target,
42 value=context
43 )
45 ### END CODE HERE ###
47 x = self.add([target, attn_output])
InvalidArgumentError: Exception encountered when calling layer 'key' (type EinsumDense).
{{function_node __wrapped__Einsum_N_2_device_/job:localhost/replica:0/task:0/device:GPU:0}} Expected dimension 512 at axis 0 of the input shaped [256,1,256] but got dimension 256 [Op:Einsum] name:
Call arguments received by layer 'key' (type EinsumDense):
• inputs=tf.Tensor(shape=(64, 19, 512), dtype=float32)
I am not hardcoding 256 anywhere; I am only using the units parameter.
Here are the dimensions at consecutive steps:
Translator: 10000, 512
Encoder: 10000, 512
Decoder: 10000, 512
Translator Context: (64, 19)
Translator Target: (64, 17)
Encoder context: (64, 19)
Encoder embedding: (64, 19, 512)
Encoder LSTM: (64, 19, 512)
Translator: Encoded context: (64, 19, 512)
Decoder input: (64, 19, 512) and (64, 17)
Decoder Embedding: (64, 17, 256)
Decoder LSTM: (64, 17, 256)
Any ideas where the issue is?
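For reference, the shapes in the error message can be reproduced outside the course code with a small numpy sketch. The kernel shape (256, 1, 256) and the 512-wide context come from the traceback above; the einsum equation is my assumption about what the EinsumDense 'key' projection does internally, not the actual course implementation:

```python
import numpy as np

# Encoded context from the printout above: batch 64, 19 tokens, width 512.
context = np.zeros((64, 19, 512))

# Kernel shaped (input_dim, num_heads, key_dim) as in the error: (256, 1, 256),
# i.e. a 'key' projection that was built expecting 256-wide inputs.
key_kernel = np.zeros((256, 1, 256))

try:
    # EinsumDense-style projection (assumed): contract the feature axis of
    # the input against the first axis of the kernel.
    np.einsum("btf,fhk->bthk", context, key_kernel)
except ValueError as err:
    print("mismatch:", err)  # fails because the input is 512-wide, not 256

# With a kernel built for 512-wide inputs, the same projection goes through:
ok_kernel = np.zeros((512, 1, 256))
out = np.einsum("btf,fhk->bthk", context, ok_kernel)
print(out.shape)  # (64, 19, 1, 256)
```

So the layer seems to have been built for 256-wide inputs while the encoded context is 512-wide, which matches the 256 vs. 512 widths in my shape printout.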