Issues with computing self_mha_output

Greetings
I am trying to implement the call() function of the EncoderLayer in C5_W4_A1_Transformer_Subclass_v1.

I don’t understand the comments on the following lines:

# calculate self-attention using mha(~1 line).
# Dropout is added by Keras automatically if the dropout parameter is non-zero during training
self_mha_output = None # Self attention (batch_size, input_seq_len, fully_connected_dim)

When I pass x as an argument to self.mha to compute self_mha_output, I get the following error, and I am unable to understand what is going on. I have read some other threads, such as [C5_W4_A1_Transformer_Subclass_v1 UNQ4], which say to pass x as input three times. Could you kindly clarify? Thank you!

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-16-00617004b1af> in <module>
      1 # UNIT TEST
----> 2 EncoderLayer_test(EncoderLayer)

~/work/W4A1/public_tests.py in EncoderLayer_test(target)
     84     encoder_layer1 = target(4, 2, 8)
     85     tf.random.set_seed(10)
---> 86     encoded = encoder_layer1(q, True, np.array([[1, 0, 1]]))
     87 
     88     assert tf.is_tensor(encoded), "Wrong type. Output must be a tensor"

/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/engine/base_layer.py in __call__(self, *args, **kwargs)
   1010         with autocast_variable.enable_auto_cast_variables(
   1011             self._compute_dtype_object):
-> 1012           outputs = call_fn(inputs, *args, **kwargs)
   1013 
   1014         if self._activity_regularizer:

<ipython-input-15-1ce9760f1868> in call(self, x, training, mask)
     40         # calculate self-attention using mha(~1 line).
     41         # Dropout is added by Keras automatically if the dropout parameter is non-zero during training
---> 42         self_mha_output = self.mha(x)  # Self attention (batch_size, input_seq_len, fully_connected_dim)
     43 
     44         # skip connection

/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/engine/base_layer.py in __call__(self, *args, **kwargs)
   1010         with autocast_variable.enable_auto_cast_variables(
   1011             self._compute_dtype_object):
-> 1012           outputs = call_fn(inputs, *args, **kwargs)
   1013 
   1014         if self._activity_regularizer:

TypeError: call() missing 1 required positional argument: 'value'

The self.mha() function requires that you pass four parameters:

  • the query
  • the key
  • the value
  • a mask

You’ve only passed one of those. Use x for the first three.

For self-attention, the query, key, and value are all the same data.

I don’t know exactly why they’re called query, key, and value. That’s just how it works. It’s entirely non-intuitive.
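
To make that concrete, here is a minimal runnable sketch (assuming self.mha is a tf.keras.layers.MultiHeadAttention layer, as in the notebook). It just shows that self-attention means feeding the same tensor in for all three roles:

import tensorflow as tf

# Toy input: batch of 1, sequence length 3, model dimension 4
x = tf.random.uniform((1, 3, 4))

# Stand-in for the notebook's self.mha layer
mha = tf.keras.layers.MultiHeadAttention(num_heads=2, key_dim=4)

# Self-attention: the same tensor is used as query, key, and value
# (the layer's positional order is query, value, key, which doesn't
# matter here because all three are x)
out = mha(x, x, x)
print(out.shape)  # (1, 3, 4): same shape as the input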

When I try to pass a mask (mask=mask) as an argument, I receive this error message: TypeError: call() got an unexpected keyword argument ‘mask’

Does this suggest I shouldn’t be passing a mask as an argument?

It could be that the “mask=” part is the problem. Do you know whether “mask=” is a valid keyword argument for this layer?
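
In case it helps, you can print the layer’s call signature to see which keyword arguments it actually accepts (assuming the layer is tf.keras.layers.MultiHeadAttention; the exact parameter list varies with the TensorFlow version):

import inspect
import tensorflow as tf

# Show the parameter names accepted by the layer's call() method
print(inspect.signature(tf.keras.layers.MultiHeadAttention.call))
# e.g. (self, query, value, key=None, attention_mask=None,
#       return_attention_scores=False, training=None)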

@Ahusu

Could you share a screenshot of the type error you’re encountering, so we can better understand your issue?

Regards
DP

For those who are still struggling to find the appropriate parameter to use for the mask, it helps to read the instructions before starting the exercise, in particular the ‘Additional hints’ section.
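
For anyone who wants to see what the hints are pointing at in runnable form, here is a minimal sketch (again assuming self.mha is tf.keras.layers.MultiHeadAttention; the attention_mask keyword comes from that layer’s API, and mask= is what raises the TypeError above):

import numpy as np
import tensorflow as tf

x = tf.random.uniform((1, 3, 4))  # (batch_size, input_seq_len, embedding_dim)
mask = np.array([[1, 0, 1]])      # 1 = attend to this position, 0 = padding

mha = tf.keras.layers.MultiHeadAttention(num_heads=2, key_dim=4)

# Same tensor as query, value, and key; the mask goes in via attention_mask
out = mha(x, x, x, attention_mask=mask)
print(out.shape)  # (1, 3, 4)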

Since when do we pass a mask to the MHA? I don’t remember the lecture covering it. The only place, if I recall correctly, where a mask is mentioned is in the decoder, where it was said that the Multi-Head Attention can also be called Masked Multi-Head Attention.

Honestly, the assignment instructions could be so much clearer.

Yes, of all of the assignments in the Deep Learning Specialization, this one has the highest expectations of the students.

Please make the instructions clearer. Not everyone here has Stanford-level intellect.

Noted.