InvalidArgumentError: Exception encountered when calling layer 'softmax_296' (type Softmax).
I cannot understand why my model summary gives an error when all the above tests pass. It points the error to my next_word function, but that function produced the expected output in all exercises. Kindly assist with where I should look; I have been trying, but I think I'm going blurry now.
Thank you, buddy, for the helpful response. In this case, what do you suggest I do? I have been going through some functions and seem to have hit a brick wall, but I'm not stopping; I'm going through it from the start and checking as well.
If you have any further advice, please share it. Thank you in advance.
There are multiple probable causes for this error.
The first thing to check is the documentation for softmax. Note that this function takes just two arguments: the input and the axis. In most cases, the axis is the default one (the last axis). So the most common usage is simply tf.keras.activations.softmax(inputs); in other words, it receives only the inputs.
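For instance, applied to a batch of logits it looks like this (a minimal sketch; the tensor values are made up for illustration):

```python
import tensorflow as tf

# Softmax over the last axis (the default), which is the usual case
# for attention weights. Values here are illustrative only.
logits = tf.constant([[1.0, 2.0, 3.0],
                      [0.5, 0.5, 0.5]])
weights = tf.keras.activations.softmax(logits)
print(weights)  # each row now sums to 1.0
```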
The most probable place for this error is Exercise 1: have you correctly computed scaled_attention_logits (the input to the softmax)? Also, in Exercise 4 the final layer uses a softmax activation, which is already defined for you in the Dense layer initialization, so you don't have to call it again there.
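For reference, the standard scaled dot-product attention pattern looks roughly like this (a generic sketch of the technique from "Attention Is All You Need", not the assignment's exact solution; the mask convention here, 1 = keep, is an assumption and may differ from your notebook):

```python
import tensorflow as tf

def scaled_dot_product_attention_sketch(q, k, v, mask=None):
    """Generic scaled dot-product attention; a sketch, not graded code."""
    matmul_qk = tf.matmul(q, k, transpose_b=True)           # (..., seq_len_q, seq_len_k)
    dk = tf.cast(tf.shape(k)[-1], tf.float32)
    scaled_attention_logits = matmul_qk / tf.math.sqrt(dk)  # this feeds the softmax
    if mask is not None:
        # Push masked positions toward -inf so they get ~0 weight after softmax.
        # Assumes mask == 1 marks positions to KEEP; flip the sign if yours differs.
        scaled_attention_logits += (1.0 - mask) * -1e9
    attention_weights = tf.keras.activations.softmax(scaled_attention_logits)  # last axis
    output = tf.matmul(attention_weights, v)                # (..., seq_len_q, depth_v)
    return output, attention_weights
```

And since the final Dense layer is built with activation='softmax', its output is already normalized; applying a softmax to it a second time would be a bug.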
Thank you for the guidance; I have also checked my softmax input against the documentation. Would you mind if I shared the code so you could have a look when you have time? It is also showing an error in my next_word function, even though all expected outcomes are met and the tests pass.
Although I feel there might be issues with other graded code too, let's go step by step with the grader cell you shared in the DM.
In the code line below, it is clearly stated that you are only supposed to create the mask for the output, and no separate instruction is given about shape-related dimensions, so your use of output.shape[1] is creating the first error:
# Create a look-ahead mask for the output
Next, in the code line below, notice that it tells you to create the mask for the input to the decoder, so using output again here is an incorrect choice:
# Create a padding mask for the input (decoder)
The hint for this, again, comes from a cell before this exercise, which shows:
dec_padding_mask = create_padding_mask(inp)  # Notice that the encoder and decoder padding masks are equal, so the input is the same for both.
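Putting both corrections together, the mask setup should follow this shape (a sketch only; the exact argument create_look_ahead_mask expects is defined in an earlier notebook cell, so double-check it there):

```python
# Sketch of the corrected mask setup (variable names follow the thread).
look_ahead_mask = create_look_ahead_mask(output)  # built from the output, as instructed
dec_padding_mask = create_padding_mask(inp)       # decoder padding mask uses the input,
                                                  # same as the encoder padding mask
```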
Make these corrections, and let me know if you still get any new errors.
Deepti - Hi. About using tf.nn.softmax: are you referring to the final layer in the Transformer class? The following line of code:
self.final_layer = tf.keras.layers.Dense(target_vocab_size, activation='softmax')
is before the "### START CODE HERE ###".
Or are you referring to using tf.nn.softmax in scaled_dot_product_attention?
The 'Additional Hints' section states:
Sorry, that suggestion was not correct, and I completely forgot to edit it. His issue was that he was calling the mask-creation steps with the wrong inputs for the encoder and decoder.
Also, if you have a similar issue, kindly create a new topic; you can always link to a comment from another post to explain what you have already tried and where you are still unsure how to resolve your error.
It's been a while - hopefully, this guidance from Deepti resolves your issue.