Issues with computing self_mha_output

Greetings
I am trying to implement the call() function of the EncoderLayer in C5_W4_A1_Transformer_Subclass_v1.

I don’t understand the comments on the following lines:

# calculate self-attention using mha(~1 line).
# Dropout is added by Keras automatically if the dropout parameter is non-zero during training
self_mha_output = None # Self attention (batch_size, input_seq_len, fully_connected_dim)

When I pass x as an argument to self.mha to compute self_mha_output, I get the following error, and I am unable to understand what is going on. I have read some other threads, such as [C5_W4_A1_Transformer_Subclass_v1 UNQ4], which say to pass x as input three times. Could you kindly clarify? Thank you!

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-16-00617004b1af> in <module>
      1 # UNIT TEST
----> 2 EncoderLayer_test(EncoderLayer)

~/work/W4A1/public_tests.py in EncoderLayer_test(target)
     84     encoder_layer1 = target(4, 2, 8)
     85     tf.random.set_seed(10)
---> 86     encoded = encoder_layer1(q, True, np.array([[1, 0, 1]]))
     87 
     88     assert tf.is_tensor(encoded), "Wrong type. Output must be a tensor"

/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/engine/base_layer.py in __call__(self, *args, **kwargs)
   1010         with autocast_variable.enable_auto_cast_variables(
   1011             self._compute_dtype_object):
-> 1012           outputs = call_fn(inputs, *args, **kwargs)
   1013 
   1014         if self._activity_regularizer:

<ipython-input-15-1ce9760f1868> in call(self, x, training, mask)
     40         # calculate self-attention using mha(~1 line).
     41         # Dropout is added by Keras automatically if the dropout parameter is non-zero during training
---> 42         self_mha_output = self.mha(x)  # Self attention (batch_size, input_seq_len, fully_connected_dim)
     43 
     44         # skip connection

/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/engine/base_layer.py in __call__(self, *args, **kwargs)
   1010         with autocast_variable.enable_auto_cast_variables(
   1011             self._compute_dtype_object):
-> 1012           outputs = call_fn(inputs, *args, **kwargs)
   1013 
   1014         if self._activity_regularizer:

TypeError: call() missing 1 required positional argument: 'value'

The self.mha() function requires that you pass four parameters:

  • the query
  • the key
  • the value
  • a mask

You’ve only passed one of those. Use x for the first three.

For self-attention, the query, key, and value are all the same data.

I don’t know exactly why they’re called query, key, and value. That’s just how it works. It’s entirely non-intuitive.
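
To make that concrete, here is a minimal runnable sketch (assuming self.mha is a tf.keras.layers.MultiHeadAttention layer, as in the notebook). It just shows that self-attention means feeding the same tensor in for all three roles:

import tensorflow as tf

# Toy input: batch of 1, sequence length 3, model dimension 4
x = tf.random.uniform((1, 3, 4))

# Stand-in for the notebook's self.mha layer
mha = tf.keras.layers.MultiHeadAttention(num_heads=2, key_dim=4)

# Self-attention: the same tensor is used as query, key, and value
# (the layer's positional order is query, value, key, which doesn't
# matter here because all three are x)
out = mha(x, x, x)
print(out.shape)  # (1, 3, 4): same shape as the input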

When I try to pass a mask (mask=mask) as an argument, I receive this error message: TypeError: call() got an unexpected keyword argument ‘mask’

Does this suggest I shouldn’t be passing a mask as an argument?

It could be that the “mask=” part is the problem. Do you know whether “mask=” is a valid keyword argument for this layer?
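
In case it helps, you can print the layer’s call signature to see which keyword arguments it actually accepts (assuming the layer is tf.keras.layers.MultiHeadAttention; the exact parameter list varies with the TensorFlow version):

import inspect
import tensorflow as tf

# Show the parameter names accepted by the layer's call() method
print(inspect.signature(tf.keras.layers.MultiHeadAttention.call))
# e.g. (self, query, value, key=None, attention_mask=None,
#       return_attention_scores=False, training=None)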

@Ahusu

Could you share a screenshot of the type error you’re encountering, so we can better understand your issue?

Regards
DP

For those who are still struggling to find the appropriate parameter to use for the mask, it helps to read the instructions before starting the exercise, in particular the ‘Additional hints’ section.
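
For anyone who wants to see what the hints are pointing at in runnable form, here is a minimal sketch (again assuming self.mha is tf.keras.layers.MultiHeadAttention; the attention_mask keyword comes from that layer’s API, and mask= is what raises the TypeError above):

import numpy as np
import tensorflow as tf

x = tf.random.uniform((1, 3, 4))  # (batch_size, input_seq_len, embedding_dim)
mask = np.array([[1, 0, 1]])      # 1 = attend to this position, 0 = padding

mha = tf.keras.layers.MultiHeadAttention(num_heads=2, key_dim=4)

# Same tensor as query, value, and key; the mask goes in via attention_mask
out = mha(x, x, x, attention_mask=mask)
print(out.shape)  # (1, 3, 4)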

Since when do we pass a mask to the MHA? I don’t remember the lecture covering it. The only place, if I recall correctly, where a mask is mentioned is in the decoder, where it was said that the Multi-Head Attention can also be called Masked Multi-Head Attention.

Honestly, the assignment instructions could be so much clearer.

Yes, of all of the assignments in the Deep Learning Specialization, this one has the highest expectations of the students.

Please make the instructions clearer. Not everyone here has Stanford-level intellect.

Noted.