C4W1_Assignment_Exercise 2 - CrossAttention

Hello! I have no idea what ‘key_dim’ must be.
I tried a lot of things; the usual result is TypeError: ('Keyword argument not understood:', 'key_dim').
Maybe the problem is not only in ‘key_dim’…
I will be happy with any advice.

Hi!
Make sure you are using the correct layer.
We have to use the MultiHeadAttention layer and NOT the Attention layer.
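
For reference, here is a minimal standalone sketch of constructing the layer (not the assignment solution; the num_heads and key_dim values below are arbitrary examples):

import tensorflow as tf

# key_dim is the size of each attention head for query and key.
# num_heads=2 and key_dim=64 are arbitrary illustrative values.
mha_layer = tf.keras.layers.MultiHeadAttention(num_heads=2, key_dim=64)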


Thank you for the reply! I use “attn_output = tf.keras.layers.MultiHeadAttention(…”. Maybe the problem really is in “key_dim”?

Ah! I see where you made the mistake.
Hint: in call, you need to invoke the correct function, with the correct arguments, that you defined in the __init__ constructor.


Thank you!
What is an example of the correct function and the correct arguments? This (tf.keras.layers.MultiHeadAttention | TensorFlow v2.14.0) is not enough; where can I see more examples?

In the CrossAttention class, the __init__ constructor initializes the class with the mha function. The call function then invokes this initialized function.
Currently you are calling the wrong function.

Secondly, to understand which arguments to pass in the calling function, check the calling arguments of MultiHeadAttention.
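
To illustrate the pattern in isolation (a generic sketch with made-up names, not the assignment's CrossAttention implementation):

import tensorflow as tf

class WrapperLayer(tf.keras.layers.Layer):  # hypothetical class, for illustration only
    def __init__(self):
        super().__init__()
        # __init__ creates the MultiHeadAttention layer once and stores it as an attribute
        self.mha = tf.keras.layers.MultiHeadAttention(num_heads=2, key_dim=64)

    def call(self, x, y):
        # call invokes the attribute that __init__ initialized
        return self.mha(query=x, value=y)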


I cannot find any similar examples…
attn_output = tf.keras.layers.mha(
    query=self.tf.keras.layers.Layer(query),
    value=self.tf.keras.layers.Layer(value)
)
Is this right?

There is no tf.keras.layers.mha; mha is the attribute that we defined in __init__. A user-defined attribute is called through self, as in self.mha.

Secondly, the query and value arguments take the context/target tensors, not Keras layers, as explained in the calling link.
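
For example, an already-built MultiHeadAttention instance is called directly on tensors (a standalone sketch; the shapes and layer sizes are dummy values):

import tensorflow as tf

mha = tf.keras.layers.MultiHeadAttention(num_heads=2, key_dim=64)
q = tf.random.normal((1, 8, 16))   # (batch, query_seq_len, features)
v = tf.random.normal((1, 10, 16))  # (batch, value_seq_len, features)
out = mha(query=q, value=v)        # pass tensors, not Keras layers
print(out.shape)                   # (1, 8, 16): output matches the query shape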


I am sure that I need not only a link about the call function but also a relevant code example, and I cannot find one. TensorFlow is not a well-known library to me.

attn_output = self.mha(
    query=context,
    value=target
)
Is this right?

This is object-oriented programming in Python.
I like this article and this article, as they give plenty of examples and clear explanations.

As far as your code is concerned, you are now on the right track and only a little tweak away from the correct implementation. Revisit the “NMT Model with Attention” video and pay “attention” to what goes where, in particular where query and value originate from (the text in green).

Finally, I would request that you take down the attempted solution code in your previous posts, as sharing code is against the community guidelines.

Thanks!
