Hello,
I am getting an error in Exercise 4 when running the tests, but I can't see where the mistake is; the values I get are really close to the expected ones:
Then, if I continue with Exercise 5, I pass all the tests again, even though in cell 30 I am not getting the expected output.
Later, in cell 33, I get this error:
```
There was a problem compiling the code from your notebook. Details:
Exception encountered when calling layer 'softmax_2' (type Softmax).
{{function_node __wrapped__AddV2_device_/job:localhost/replica:0/task:0/device:CPU:0}} Incompatible shapes: [1,2,150,150] vs. [1,1,2,2] [Op:AddV2] name:
Call arguments received by layer 'softmax_2' (type Softmax):
  • inputs=tf.Tensor(shape=(1, 2, 150, 150), dtype=float32)
  • mask=tf.Tensor(shape=(1, 1, 2, 2), dtype=float32)
```
And if I grade the notebook, I get a 0, with the same error reported again for all the graded parts.
Also, across the whole notebook I've been facing this kind of warning, even after having deleted the files, rebooted, and fetched the latest version:
@gent.spah's response was to the query asked by @dani7991, so kindly create your own topic, with a screenshot of your error or of the output that differs from what is expected.
What @gent.spah means is that the error log is telling you the issue lies somewhere in a previous graded cell.
From that error detail, what I understand is that the inputs are not of the same shape. So the most probable reason is that, while defining the inputs, the learner who created the post added an extra dimension by specifying the input as (None, 1) when it should have been (1), with the correct tf.string datatype.
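For illustration only (these shapes are hypothetical, not taken from the assignment's code): an input declared with shape (1,) yields batched tensors of rank 2, shape (batch, 1), while declaring it as (None, 1) adds a whole extra axis, giving rank-3 tensors of shape (batch, seq, 1). A minimal numpy sketch of that difference:

```python
import numpy as np

batch = 4
# Intended: Input(shape=(1,)) -> one string per example, rank-2 batch
intended = np.zeros((batch, 1))
# Mistake described above: Input(shape=(None, 1)) -> a stray extra axis
seq = 7  # arbitrary length the extra axis could take
with_extra_dim = np.zeros((batch, seq, 1))

print(intended.ndim, with_extra_dim.ndim)  # 2 3
```

That stray axis then propagates through every downstream tensor, which is how shape mismatches show up far from where they were introduced.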
Sorry, that was me; I didn't notice I was logged in on my other account.
And answering @Deepti_Prasad: I have not added any extra dimension. I can't post my code, but I haven't modified any input of the transformer function, nor of the decoder.
Oh okay, I didn't know people have two accounts in the Discourse community.
Can you share a screenshot of the code, by personal DM, from the graded cell where the first error appeared?
That CUDA screenshot is only a TensorFlow-related warning. But the copy-pasted error you mentioned does say your inputs are not of the same shape; that could also be related to the encoder or decoder code.
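To make the shape complaint concrete: the Softmax layer applies a mask by adding it to the logits, so the mask must broadcast against the attention scores. The two shapes from the error log above do not broadcast, which numpy can show directly (the (1, 1, 1, 150) alternative below is only an illustration of a shape that would broadcast, not the assignment's exact expected mask):

```python
import numpy as np

# Shapes copied from the error log: attention scores vs. mask
scores_shape = (1, 2, 150, 150)
mask_shape = (1, 1, 2, 2)

try:
    np.broadcast_shapes(scores_shape, mask_shape)
except ValueError as e:
    # Same root cause as the AddV2 "Incompatible shapes" error
    print("incompatible:", e)

# A mask whose trailing axes line up with the scores broadcasts fine:
print(np.broadcast_shapes(scores_shape, (1, 1, 1, 150)))  # (1, 2, 150, 150)
```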
1. In the GRADED FUNCTION scaled_dot_product_attention, for the code line under the comment "softmax is normalized on the last axis (seq_len_k) so that the scores add up to 1": you don't need to pass the axis argument, since softmax already normalizes over the last axis by default.
2. In Exercise 2, for BLOCK1 and BLOCK2, while calculating self-attention you added training=training, which is not required for these steps: as the instructions mention, dropout is already applied inside the multi-head attention layer.
3. In Exercise 4 (Transformer), for the code line under "pass decoder output through a linear layer and softmax", you again included training=training, which is not required, as per the instruction below:
"Finally, after the Nth Decoder layer, one dense layer and a softmax are applied to generate prediction for the next output in your sequence"
You already used training=training in the previous step, when calling self.encoder.
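On the scaled_dot_product_attention point above: both tf.nn.softmax and the Keras softmax default to axis=-1, which in that function is seq_len_k, so passing axis explicitly is redundant. A small numpy stand-in for that last-axis behavior (hypothetical score shapes, not the assignment's code):

```python
import numpy as np

def softmax_last_axis(scores):
    """Softmax over the last axis, matching TF's default (axis=-1)."""
    shifted = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(shifted)
    return e / e.sum(axis=-1, keepdims=True)

# (batch, num_heads, seq_len_q, seq_len_k) scaled attention scores
scores = np.random.randn(1, 2, 3, 4)
weights = softmax_last_axis(scores)

# Each query position's weights over seq_len_k sum to 1:
print(np.allclose(weights.sum(axis=-1), 1.0))  # True
```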
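On the BLOCK1/BLOCK2 point: a minimal sketch of a self-attention call without an explicit training argument (hypothetical sizes, not the assignment's exact code). Keras MultiHeadAttention applies its own dropout when `dropout` is set at construction, and Keras propagates the `training` flag to nested layers automatically, so passing training=training here is unnecessary:

```python
import tensorflow as tf

# dropout is configured on the layer itself
mha = tf.keras.layers.MultiHeadAttention(num_heads=2, key_dim=8, dropout=0.1)
x = tf.random.uniform((1, 5, 16))         # (batch_size, seq_len, embedding_dim)
self_attn = mha(query=x, value=x, key=x)  # no training argument passed
print(self_attn.shape)                    # (1, 5, 16)
```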
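On the Exercise 4 point: a sketch of "one dense layer and a softmax" applied after the final decoder layer (hypothetical sizes). Neither Dense nor the softmax activation behaves differently between training and inference, so training=training adds nothing in this call:

```python
import tensorflow as tf

vocab_size = 32
dec_output = tf.random.uniform((1, 5, 16))  # (batch, target_len, d_model)

# Dense layer with softmax activation producing next-token probabilities
final_layer = tf.keras.layers.Dense(vocab_size, activation="softmax")
probs = final_layer(dec_output)             # (batch, target_len, vocab_size)
print(probs.shape)
```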
Can you post here the error you got now? The error log you shared is related to the next-word graded cell code, and your current error clearly states that your code for prediction_id is incorrectly sequenced.
Kindly post your query clearly to avoid confusion. Your main error was an invalid-argument error, and in your first comment you only shared a part of it.
Code is always interconnected. See the last image in the first comment you posted: in that screenshot you only included the header part, without sharing the whole error log.