The first instruction says:

- You will pass the Q, V, K matrices and a boolean mask to a multi-head attention layer. Remember that to compute
*self*-attention Q, V and K should be the same.

But how do I get the Q, V, K matrices? They aren't included in the `def call(self, x, …)` parameters.

I keep calling `self_attn_output = self.mha(…)`, but I keep getting error messages.
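For reference, here's a minimal standalone sketch of what I think the layer expects (I'm assuming it's `tf.keras.layers.MultiHeadAttention`; the shapes and layer hyperparameters here are made up, not from the assignment):

```python
import tensorflow as tf

# Hypothetical layer config, just for illustration
mha = tf.keras.layers.MultiHeadAttention(num_heads=2, key_dim=16)

x = tf.random.uniform((1, 5, 16))          # (batch, seq_len, features)
mask = tf.ones((1, 5, 5), dtype=tf.bool)   # boolean attention mask (batch, target_len, source_len)

# For self-attention, query, value and key are all the same tensor x
out = mha(query=x, value=x, key=x, attention_mask=mask)
print(out.shape)  # (1, 5, 16)
```

Is this roughly right, i.e. inside `call(self, x, …)` I should just pass `x` as all three of query, value and key?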

Any help?