C4W1_Assignment exercise 3 - decoder

I had defined the following in init for Decoder.
[code removed by moderator as it is part of a graded cell]

However, I got error as shown. Any hint?


Hi @pongyuenlam

  1. For the code line

"Pass the embedded input into the pre-attention LSTM"
Hints:
The LSTM you defined earlier should return the output alongside the state (made up of two tensors)
Pass in the state to the LSTM (needed for inference)

the call signature of the decoder clearly states that state should be None here, but you have used initial_state=state, which causes this error. It should be initial_state=None.

  2. Another issue I can see is in the code line
    "Get the embedding of the input": you have used the incorrect input. Remember, it is the right-shifted translation that is used as the input.

  3. For the code line below
    "Perform cross attention between the context and the output of the LSTM (in that order)"
    you are supposed to use the attention layer you recalled in __init__, not CrossAttention directly.
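Putting the three hints together, here is a generic sketch of how such an attentive decoder's call might be wired up. This is illustrative only, not the graded solution: the layer names (self.embedding, self.pre_attention_rnn, self.attention, self.post_attention_rnn, self.output_layer) and the use of MultiHeadAttention in place of the assignment's CrossAttention wrapper are assumptions.

```python
import tensorflow as tf


class Decoder(tf.keras.layers.Layer):
    """Generic attentive-decoder sketch (illustrative names, not the graded code)."""

    def __init__(self, vocab_size, units):
        super().__init__()
        self.embedding = tf.keras.layers.Embedding(vocab_size, units)
        # Pre-attention LSTM returns the sequence AND its state (two tensors)
        self.pre_attention_rnn = tf.keras.layers.LSTM(
            units, return_sequences=True, return_state=True
        )
        # Cross attention between the encoder context and the LSTM output;
        # MultiHeadAttention stands in for the assignment's CrossAttention layer
        self.attention = tf.keras.layers.MultiHeadAttention(num_heads=1, key_dim=units)
        self.post_attention_rnn = tf.keras.layers.LSTM(units, return_sequences=True)
        self.output_layer = tf.keras.layers.Dense(
            vocab_size, activation=tf.nn.log_softmax
        )

    def call(self, context, target, state=None):
        # The input is the right-shifted translation, not the context
        x = self.embedding(target)
        # state defaults to None in training; it is supplied at inference time
        x, hidden_state, cell_state = self.pre_attention_rnn(x, initial_state=state)
        # Use the attention layer recalled in __init__, not the class directly
        x = self.attention(query=x, value=context)
        x = self.post_attention_rnn(x)
        return self.output_layer(x)
```

With vocab_size=12000 and a (64, 15) target batch, the returned logits would have shape (64, 15, 12000), matching the expected output discussed below in this thread.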

Regards
DP

Dear @Deepti_Prasad,
Thanks. I have made the changes. I printed out x.shape after each key step and got the following error. If I skip the post-attention layer, the shape of the logits is (64, 15, 12000). Do you have any advice?


Hi @pongyuenlam

Your logits shape is missing a dimension when compared to the expected output.

So first I would check whether the two code lines below in the decoder's def call are recalled correctly:

"Do a pass through the post attention LSTM"
x = make sure you have recalled the correct post-attention LSTM here, which would be the RNN defined after attention, i.e. ???(x)

"Compute the logits"
For this you are supposed to recall the dense layer with log-softmax activation; which one would that be?

If the above two code lines were recalled correctly, then check the two code lines below:

  1. The RNN after attention: did you recall the units correctly, and what should return_sequences be set to?
  2. The dense layer with log-softmax activation: here units is not units; check the instructions section for this layer. It should have the same number of units as the size of the vocabulary, since you expect it to compute the logits for every possible word in the vocabulary.
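A tiny sketch of why these two settings determine the logits shape (the batch/sequence/unit numbers here are made up for illustration; only the "Dense units = vocabulary size" point comes from the assignment):

```python
import tensorflow as tf

units, vocab_size = 8, 120
x = tf.random.normal((4, 7, units))  # (batch, seq_len, units) after attention

# return_sequences=True keeps the time axis, so the dense layer can
# emit a logit vector for every position in the sequence
rnn_seq = tf.keras.layers.LSTM(units, return_sequences=True)
rnn_last = tf.keras.layers.LSTM(units)  # default: only the final step survives

# The dense layer's units must equal the vocabulary size
log_softmax_dense = tf.keras.layers.Dense(vocab_size, activation=tf.nn.log_softmax)

print(log_softmax_dense(rnn_seq(x)).shape)  # (4, 7, 120)
print(rnn_last(x).shape)                    # (4, 8) -- time axis collapsed
```

If return_sequences is left at its default, the time dimension disappears and the logits come out one axis short, which is exactly the symptom described above.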

If all of these were done as mentioned, then DM me a screenshot of the graded Decoder cell; chances are your previous cell won't have matching units.

Regards
DP


Hi @Deepti_Prasad
Thanks! After following your advice and making one change to return_sequences for the RNN after attention, I can now pass all the tests for exercise 3. Thank you so much!
