C4W1_Assignment exercise 3 - decoder

I had defined the following in init for Decoder.
[code removed by moderator as it is part of a graded cell]

However, I got error as shown. Any hint?


Hi @pongyuenlam

  1. For the code line

"Pass the embedded input into the pre-attention LSTM"
Hints:
The LSTM you defined earlier should return the output alongside the state (made up of two tensors)
Pass in the state to the LSTM (needed for inference)

the call signature of the decoder clearly states that state should be None here, but you have used initial_state=state, which causes this error. It should be initial_state=None.

  2. Another issue I can see is in the code line
    "Get the embedding of the input": you have used the incorrect input. Remember, it is the right-shifted translation that is used as the input.

  3. For the code line below
    "Perform cross attention between the context and the output of the LSTM (in that order)"
    you are supposed to use the attention layer you recalled in __init__, not CrossAttention directly.
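Putting the three hints together, here is a generic sketch of how such an attentive decoder's call might be wired up. This is illustrative only, not the graded solution: the layer names (self.embedding, self.pre_attention_rnn, self.attention, self.post_attention_rnn, self.output_layer) and the use of MultiHeadAttention in place of the assignment's CrossAttention wrapper are assumptions.

```python
import tensorflow as tf


class Decoder(tf.keras.layers.Layer):
    """Generic attentive-decoder sketch (illustrative names, not the graded code)."""

    def __init__(self, vocab_size, units):
        super().__init__()
        self.embedding = tf.keras.layers.Embedding(vocab_size, units)
        # Pre-attention LSTM returns the sequence AND its state (two tensors)
        self.pre_attention_rnn = tf.keras.layers.LSTM(
            units, return_sequences=True, return_state=True
        )
        # Cross attention between the encoder context and the LSTM output;
        # MultiHeadAttention stands in for the assignment's CrossAttention layer
        self.attention = tf.keras.layers.MultiHeadAttention(num_heads=1, key_dim=units)
        self.post_attention_rnn = tf.keras.layers.LSTM(units, return_sequences=True)
        self.output_layer = tf.keras.layers.Dense(
            vocab_size, activation=tf.nn.log_softmax
        )

    def call(self, context, target, state=None):
        # The input is the right-shifted translation, not the context
        x = self.embedding(target)
        # state defaults to None in training; it is supplied at inference time
        x, hidden_state, cell_state = self.pre_attention_rnn(x, initial_state=state)
        # Use the attention layer recalled in __init__, not the class directly
        x = self.attention(query=x, value=context)
        x = self.post_attention_rnn(x)
        return self.output_layer(x)
```

With vocab_size=12000 and a (64, 15) target batch, the returned logits would have shape (64, 15, 12000), matching the expected output discussed below in this thread.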

Regards
DP

Dear @Deepti_Prasad,
Thanks. I have made the changes. I printed out x.shape after each key step and got the following error. If I skip the post-attention layer, the shape of the logits is (64, 15, 12000). Do you have any advice?


Hi @pongyuenlam

Your logits shape is missing a dimension when compared to the expected output.

So first I would check whether the two code lines below in the decoder's def call are recalled correctly:

"Do a pass through the post attention LSTM"
x = make sure you have recalled the correct post-attention LSTM here, which would be the RNN defined after attention, i.e. ???(x)

"Compute the logits"
For this you are supposed to recall the dense layer with log-softmax activation; which one would that be?

If the above two code lines were recalled correctly, then check the two code lines below:

  1. The RNN after attention: did you recall the units correctly, and what should return_sequences be set to?
  2. The dense layer with log-softmax activation: here units is not units; check the instructions section for this layer. It should have the same number of units as the size of the vocabulary, since you expect it to compute the logits for every possible word in the vocabulary.
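A tiny sketch of why these two settings determine the logits shape (the batch/sequence/unit numbers here are made up for illustration; only the "Dense units = vocabulary size" point comes from the assignment):

```python
import tensorflow as tf

units, vocab_size = 8, 120
x = tf.random.normal((4, 7, units))  # (batch, seq_len, units) after attention

# return_sequences=True keeps the time axis, so the dense layer can
# emit a logit vector for every position in the sequence
rnn_seq = tf.keras.layers.LSTM(units, return_sequences=True)
rnn_last = tf.keras.layers.LSTM(units)  # default: only the final step survives

# The dense layer's units must equal the vocabulary size
log_softmax_dense = tf.keras.layers.Dense(vocab_size, activation=tf.nn.log_softmax)

print(log_softmax_dense(rnn_seq(x)).shape)  # (4, 7, 120)
print(rnn_last(x).shape)                    # (4, 8) -- time axis collapsed
```

If return_sequences is left at its default, the time dimension disappears and the logits come out one axis short, which is exactly the symptom described above.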

If all of these were done as mentioned, then DM me a screenshot of the graded Decoder cell; chances are your previous cell won't have matching units.

Regards
DP


Hi @Deepti_Prasad
Thanks! After following your advice and making one change to return_sequences for the RNN after attention, I can now pass all the tests for exercise 3. Thank you so much!
