C5_W4_A1_Ex3_Poor_Instruction

The instructions for Exercise 3 are not informative at all; they do not help and cause a lot of frustration!

  1. What does scale_matmul_qk really mean? Wouldn't it be easier to say dk = tf.shape(k)[-1] # seq_len_k (which, by the way, returns an error)?
  2. There is no clear explanation of the instruction "Multiply (1. - mask) by -1e9 before applying the softmax". Why is it exactly (1. - mask)*-1e9 and not just mask*-1e9?
  3. It would have been better to add tf.nn.softmax(..., axis=...) to the additional hints, as a reminder to use the softmax from TensorFlow.
  4. It would have been better to add tf.cast(x, dtype, name=None) to the additional hints, to explain why dk's type has to be changed to avoid the InvalidArgumentError: Value for attr 'T' of int32 is not in the list of allowed values!
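
For anyone else stuck on point 2, here is a minimal sketch of why the sign flip matters. It assumes the mask convention 1 = attend and 0 = ignore (for example, padding), which is an assumption on my part and not quoted from the notebook. It uses plain Python instead of TensorFlow so it runs anywhere, and masked_softmax is a made-up helper name, not part of the assignment.

```python
import math


def masked_softmax(scores, mask=None):
    """Softmax over one row of attention scores with an optional keep-mask.

    Assumed convention: mask value 1 = attend, 0 = ignore.
    """
    if mask is not None:
        # (1 - m) is 0 for kept positions and 1 for ignored ones, so only
        # the ignored positions receive the large negative offset.
        scores = [s + (1.0 - m) * -1e9 for s, m in zip(scores, mask)]
    # Subtract the max for numerical stability before exponentiating.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


scores = [2.0, 1.0, 0.5]
mask = [1, 1, 0]  # the last position is padding and should get no weight

# Correct: the padded position ends up with zero attention weight.
correct = masked_softmax(scores, mask)

# Wrong: mask * -1e9 pushes the *kept* positions to about -1e9 instead,
# so all the weight lands on the padded position.
wrong_scores = [s + m * -1e9 for s, m in zip(scores, mask)]
wrong = masked_softmax(wrong_scores)

print(correct)
print(wrong)
```

With this convention, mask*-1e9 would silence exactly the positions you want to keep. If your mask is defined the other way round (1 = ignore), then mask*-1e9 would be the right form, so it is worth checking how the notebook builds its masks.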

Cheers,

Thanks for your list of issues.

I got stuck for more than 2 hours because I missed the Reminder: The boolean mask parameter can be passed in as None or as either padding or look-ahead.

Multiply (1. - mask) by -1e9 before applying the softmax. 

bit…