C4W2_Assignment - Exercise 4 Transformer

Hello,
I am getting an error in Exercise 4 when passing the tests, but I can't see where the mistake is; I am also getting values really close to the expected ones:

[screenshot: test output values close to, but not matching, the expected ones]
Then if I continue with Exercise 5, I pass all the tests again, even though in cell 30 I am not getting the expected output.
Later, in cell 33, I get this error:
```
There was a problem compiling the code from your notebook. Details:
Exception encountered when calling layer 'softmax_2' (type Softmax).

{{function_node __wrapped__AddV2_device_/job:localhost/replica:0/task:0/device:CPU:0}} Incompatible shapes: [1,2,150,150] vs. [1,1,2,2] [Op:AddV2] name:

Call arguments received by layer 'softmax_2' (type Softmax):
  • inputs=tf.Tensor(shape=(1, 2, 150, 150), dtype=float32)
  • mask=tf.Tensor(shape=(1, 1, 2, 2), dtype=float32)
```
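
For reference, the shape clash in that log is a plain broadcasting failure, and it can be reproduced in isolation (a minimal sketch, independent of the notebook's code):

```python
import tensorflow as tf

# The Softmax layer adds the mask to the attention logits; with these
# shapes the broadcast fails exactly as in the log above.
logits = tf.zeros((1, 2, 150, 150))   # (batch, heads, seq_len_q, seq_len_k)
mask = tf.zeros((1, 1, 2, 2))         # wrong shape: cannot broadcast to logits

try:
    _ = logits + mask
except tf.errors.InvalidArgumentError as e:
    print(e)  # Incompatible shapes: [1,2,150,150] vs. [1,1,2,2] [Op:AddV2]
```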

And if I grade the notebook, I get a 0 with the following error again for all the graded parts:

[grader error output not captured]

Also, across the whole notebook I've been facing these kinds of warnings, even after having deleted the files, rebooted, and fetched the latest version:

[screenshot: TensorFlow CUDA warnings]
Thank you for your time!

I would start by checking the outputs produced by the create_padding_mask and create_look_ahead_mask functions. As you can see,

the mask and the inputs do not have the same shapes!
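
For comparison, here is a minimal sketch of those two helpers following the TensorFlow tutorial's conventions (the notebook's exact code and mask polarity may differ); the thing to check is that the returned shapes broadcast against attention logits of shape (batch, num_heads, seq_len, seq_len):

```python
import tensorflow as tf

def create_padding_mask(seq):
    # 1.0 where the token id is 0 (padding), 0.0 elsewhere
    # (TF tutorial polarity: 1 = masked; some notebooks invert this)
    mask = tf.cast(tf.math.equal(seq, 0), tf.float32)
    return mask[:, tf.newaxis, tf.newaxis, :]  # (batch, 1, 1, seq_len)

def create_look_ahead_mask(size):
    # strictly upper-triangular ones mask out future positions
    return 1 - tf.linalg.band_part(tf.ones((size, size)), -1, 0)  # (size, size)

x = tf.constant([[7, 6, 0, 0], [1, 2, 3, 0]])
print(create_padding_mask(x).shape)     # (2, 1, 1, 4)
print(create_look_ahead_mask(4).shape)  # (4, 4)
```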

But this is a locked cell I am not supposed to overwrite, isn't it?

@JOSE_DANIEL_HERNANDE

@gent.spah's response is to the query asked by @dani7991. So kindly create your own topic, with a screenshot of your error or of the output that differs from the expected one.

What @gent.spah means is that the error log is telling you the issue lies somewhere in a previous graded cell.

From the error details, what I understand is that the inputs are not of the same shape. So the most probable reason would be that, while specifying the inputs, the post creator added an extra dimension by declaring the input as (None, 1) when it should have been (1,) with the correct tf.string datatype.
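
To illustrate the kind of extra dimension being described (a hypothetical example; the learner's actual code isn't shown here):

```python
import tensorflow as tf

# hypothetical inputs, just to show the shape difference
a = tf.constant(["hello world"])    # shape (1,), dtype=string - intended
b = tf.constant([["hello world"]])  # shape (1, 1) - accidental extra dimension
print(a.shape, a.dtype)             # (1,) <dtype: 'string'>
print(b.shape, b.dtype)             # (1, 1) <dtype: 'string'>
```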

Regards
DP


Sorry, it was me, but I didn't notice I was logged in on the other account.

And answering @Deepti_Prasad: I have not added any extra dimension. I can't post my code, but I haven't modified any input of the transformer function, nor of the decoder.

Oh okay :crazy_face: I didn't know people had two accounts in the Discourse community.

Can you share a screenshot of the code by personal DM, starting from the graded cell that showed the first error?

That CUDA screenshot is only a warning related to TensorFlow. But the copy-pasted error you mentioned does say your inputs are not of the same shape; that could also be related to the encoder or decoder code.

Regards
DP


hi @dani7991

1. In GRADED FUNCTION: scaled_dot_product_attention, for the code line commented "softmax is normalized on the last axis (seq_len_k) so that the scores add up to 1", you don't need to pass an axis argument, since the softmax already normalizes over the last axis by default (see the sketch after this list).

2. In Exercise 2, for BLOCK 1 and BLOCK 2, while calculating self-attention you added training=training, which is not required for these steps: as the instructions mention, dropout is already included in the multi-head attention layer.

3. In Exercise 4 Transformer, for the code line "pass decoder output through a linear layer and softmax", you have again included training=training, which is not required, because as per the instruction below:

"Finally, after the Nth Decoder layer, one dense layer and a softmax are applied to generate prediction for the next output in your sequence"

you already used it in the previous step when calling self.encoder.
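
Here is a minimal sketch of point 1, based on the standard TensorFlow tutorial version of scaled_dot_product_attention rather than the notebook's exact code (points 2 and 3 simply amount to not passing training=training to calls that don't need it):

```python
import tensorflow as tf

def scaled_dot_product_attention(q, k, v, mask=None):
    matmul_qk = tf.matmul(q, k, transpose_b=True)           # (..., seq_len_q, seq_len_k)
    dk = tf.cast(tf.shape(k)[-1], tf.float32)
    scaled_attention_logits = matmul_qk / tf.math.sqrt(dk)
    if mask is not None:
        # here mask == 1 means "attend"; masked positions get a large
        # negative value before the softmax (conventions differ by notebook)
        scaled_attention_logits += (1.0 - mask) * -1e9
    # point 1: no axis argument needed - softmax defaults to the last axis
    # (seq_len_k), so each row of weights already sums to 1
    attention_weights = tf.nn.softmax(scaled_attention_logits)
    output = tf.matmul(attention_weights, v)                # (..., seq_len_q, depth_v)
    return output, attention_weights

# quick shape check
q = k = v = tf.random.uniform((1, 3, 4))
out, w = scaled_dot_product_attention(q, k, v)
print(out.shape, w.shape)  # (1, 3, 4) (1, 3, 3)
```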

Let me know if the issue gets resolved.


Can you post here the error you got now? The error log you shared is related to the next_word graded cell code, as your current error clearly states that your code for prediction_id is incorrectly sequenced.

Kindly post your query clearly to avoid confusion. Your main error was an InvalidArgumentError, of which you only shared a part in the first comment where you asked about your issue.

Code is always interconnected. See the last image in the first comment you posted: in the screenshot you only included the header part, without sharing the whole error log.

Sure, this is the whole trace:

[screenshot: full error traceback]
Notice the prediction id for next_word; you have sequenced it incorrectly. Mention first the input, then the output, and lastly the model.

But this is the call of the function, and its cell is locked:

[screenshot: the locked cell that calls next_word]
And this is the next_word function, so the params are model, input, and output.

I am still not sure, but I think the problem is further up: the tests for the Transformer (Exercise 4) are still failing.

Maybe we can dive into that part in private so I can show you the code?

yes @dani7991

I want to see your code from the first graded cell up to where the error was thrown, after the correction you made.

Chances are the issue could be with the masking.

Also, I would advise getting a fresh copy and re-doing the assignment.

hi @dani7991

In Exercise 4, Transformer, under the def call statement, for the code line

"call self.decoder with the appropriate arguments to get the decoder output"

you have used an incorrect argument for dec_output: you are supposed to use the output sequence, not the input sequence.
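
A minimal, self-contained sketch of the data flow being described, with hypothetical stand-in layers (the real Encoder and Decoder come from the earlier exercises; argument names are assumptions, not the graded code):

```python
import tensorflow as tf

def transformer_call(encoder, decoder, final_layer,
                     input_sentence, output_sentence,
                     training, enc_padding_mask, look_ahead_mask, dec_padding_mask):
    # the encoder consumes the INPUT sequence
    enc_output = encoder(input_sentence, training, enc_padding_mask)
    # the decoder consumes the OUTPUT (shifted target) sequence - passing
    # input_sentence here was the bug in this thread
    dec_output, attn_weights = decoder(output_sentence, enc_output, training,
                                       look_ahead_mask, dec_padding_mask)
    # final dense layer (+ softmax) produces the next-token predictions
    return final_layer(dec_output), attn_weights

# quick smoke test with trivial stand-ins
enc = lambda x, training, mask: x
dec = lambda y, enc_out, training, la_mask, pad_mask: (y + enc_out, None)
fin = lambda d: d
out, _ = transformer_call(enc, dec, fin,
                          tf.ones((1, 3, 4)), tf.zeros((1, 3, 4)),
                          False, None, None, None)
print(out.shape)  # (1, 3, 4)
```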


omg!! Finally, it was that, so many thanks!!!