Programming Assignment: Transformers Architecture with TensorFlow, EncoderLayer

I am confused about the inputs to multi-head attention.
They should be Q, K and V, right?
self.mha = MultiHeadAttention(num_heads=num_heads,
This is from the programming assignment,
exercise 4,
class EncoderLayer:
def call(self, x, training, mask):

The call function only takes x as its input,
so what values should be passed to self.mha?
I wrote self_mha_output = self.mha(x, return_attention_scores=True, training=True)

and got an error like TypeError: call() missing 1 required positional argument: 'value'
Sorry, I still don't quite understand all the details of the encoder-decoder architecture.
Can anyone help me?


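Hi @maxma,
As a hint, in this exercise self.mha() should be called with four arguments: one each for Q, K and V, plus the mask. In this specific case, x represents Q, K and V, because in the encoder's self-attention Q, K and V are all the same tensor. The Keras layer itself only requires the query and value arguments, which is why the error complains about a missing 'value' when only x is passed once.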

So, just to be clear, you have to use ‘x’ three times.
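
For illustration, here is a minimal standalone sketch of self-attention with tf.keras.layers.MultiHeadAttention. It is not the assignment's EncoderLayer: the sizes (batch_size, seq_len, embedding_dim, num_heads) and the all-ones mask are made up for the example, and the mask is passed by keyword so the call does not depend on the positional order of the layer's optional arguments.

```python
import tensorflow as tf

# Hypothetical sizes, chosen only for this example.
batch_size, seq_len, embedding_dim, num_heads = 2, 5, 16, 4

mha = tf.keras.layers.MultiHeadAttention(num_heads=num_heads, key_dim=embedding_dim)

# A dummy input batch: (batch_size, seq_len, embedding_dim).
x = tf.random.uniform((batch_size, seq_len, embedding_dim))

# A mask of shape (batch_size, seq_len, seq_len); True means "may attend".
# Here every position is allowed to attend to every other position.
mask = tf.ones((batch_size, seq_len, seq_len), dtype=tf.bool)

# Self-attention: the same tensor x is used as query, value and key.
# With return_attention_scores=True the layer returns a tuple.
attn_output, attn_scores = mha(
    x, x, x, attention_mask=mask, return_attention_scores=True
)

print(attn_output.shape)  # (batch_size, seq_len, embedding_dim)
print(attn_scores.shape)  # (batch_size, num_heads, seq_len, seq_len)
```

Note that with return_attention_scores=True the result is a pair (output, scores), so it has to be unpacked; without that flag the layer returns only the output tensor. Inside a layer's call method you would probably also want to forward the training argument you receive rather than hard-coding training=True.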