NLP with Attention Models C4W2 - Exercise 5 - next_word “softmax_404” error

Hi @John_Murphy1

Here are the issues in your assignment, going through the graded cells one by one:

GRADED FUNCTION: scaled_dot_product_attention
Recall the instruction: softmax is normalized on the last axis (seq_len_k) so that the scores add up to 1. You do not need to pass an explicit axis argument here, because softmax already defaults to the last axis.
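
For reference, here is a minimal sketch of scaled dot-product attention under that note. The function name and the mask sign convention in the comments are assumptions, not the graded template, so adapt them to the helpers in your notebook:

```python
import tensorflow as tf

def scaled_dot_product_attention_sketch(q, k, v, mask=None):
    """Illustrative sketch only; the mask convention below is an assumption."""
    # Attention scores: Q K^T, shape (..., seq_len_q, seq_len_k)
    matmul_qk = tf.matmul(q, k, transpose_b=True)

    # Scale by sqrt(d_k)
    dk = tf.cast(tf.shape(k)[-1], tf.float32)
    scaled_attention_logits = matmul_qk / tf.math.sqrt(dk)

    # Assumption: mask uses 1 for tokens to keep and 0 for padding, so padded
    # positions receive a large negative logit before the softmax.
    if mask is not None:
        scaled_attention_logits += (1.0 - mask) * -1e9

    # Softmax is normalized on the last axis (seq_len_k) by default,
    # so no axis argument is needed here.
    attention_weights = tf.keras.activations.softmax(scaled_attention_logits)

    # Weighted sum of the values, shape (..., seq_len_q, depth_v)
    output = tf.matmul(attention_weights, v)
    return output, attention_weights
```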

GRADED FUNCTION: DecoderLayer
For Block 1, the instructions clearly state that dropout will be applied during training, so you should not pass training=training to the self-attention call that produces mult_attn_out1. The same applies to mult_attn_out2. A sketch of Block 1 without that argument follows below.
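
This is only a minimal sketch of what Block 1 looks like without training=training; the layer names, defaults, and call signature here are assumptions, not the graded template:

```python
import tensorflow as tf

class DecoderLayerBlock1Sketch(tf.keras.layers.Layer):
    """Sketch of Block 1 only; names and defaults are assumptions."""
    def __init__(self, embedding_dim=128, num_heads=8, dropout_rate=0.1):
        super().__init__()
        self.mha1 = tf.keras.layers.MultiHeadAttention(
            num_heads=num_heads, key_dim=embedding_dim, dropout=dropout_rate)
        self.layernorm1 = tf.keras.layers.LayerNormalization(epsilon=1e-6)

    def call(self, x, look_ahead_mask=None):
        # Block 1: masked self-attention on the decoder input.
        # Dropout inside MultiHeadAttention is applied during training,
        # so (per the assignment instructions) training=training is NOT passed here.
        mult_attn_out1, attn_weights_block1 = self.mha1(
            x, x, x, attention_mask=look_ahead_mask, return_attention_scores=True)
        # Residual connection followed by layer normalization.
        Q1 = self.layernorm1(mult_attn_out1 + x)
        return Q1, attn_weights_block1
```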

GRADED FUNCTION: Decoder
No mistakes

GRADED FUNCTION: Transformer
No mistakes

GRADED FUNCTION: next_word
In the line whose comment reads "Create a look-ahead mask for the output", you were supposed to use create_padding_mask, but you used create_look_ahead_mask. This is the main reason behind your error.
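
Roughly, the mask setup should look like the sketch below. The create_padding_mask stub and the exact argument passed to it are assumptions based on the usual template (your notebook already defines the real helper), so check them against your own copy:

```python
import tensorflow as tf

def create_padding_mask(seq):
    # Assumed convention (1 = real token, 0 = padding); stubbed here only so
    # the sketch runs on its own. Use the helper already defined in your notebook.
    return tf.cast(tf.math.not_equal(seq, 0), tf.float32)[:, tf.newaxis, :]

# Inside next_word, both masks are built with create_padding_mask on the encoder
# input, even though the starter comment mentions a "look-ahead mask for the output".
encoder_input = tf.constant([[5, 7, 9, 0, 0]])           # toy tokenized sentence with padding
enc_padding_mask = create_padding_mask(encoder_input)    # mask for the input
dec_padding_mask = create_padding_mask(encoder_input)    # NOT create_look_ahead_mask
```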

Regards
DP