NLP C4W1 Exercise 3 Decoder

Hello, I have an issue with Week 1, Exercise 3 of Course 4, NLP with Attention Models. The problem appears when displaying the output: the shape of the logits is (64, 15, 256) instead of the expected (64, 15, 12000).

Tensor of contexts has shape: (64, 14, 256)
Tensor of right-shifted translations has shape: (64, 15)
Tensor of logits has shape: (64, 15, 256)

Expected Output
Tensor of contexts has shape: (64, 14, 256)
Tensor of right-shifted translations has shape: (64, 15)
Tensor of logits has shape: (64, 15, 12000)

I’m unsure where the error lies, but I suspect it might be in CrossAttention or PostAttention.

In CrossAttention, I pass context and x as parameters; in PostAttention, only x is passed.

Any ideas on what I might be doing wrong?

Thank you in advance.
Regards.

The issue should be here:

The dense layer with log_softmax activation

    self.output_layer

specifically the units parameter here!
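To illustrate the fix, here is a minimal sketch of that final layer in TensorFlow. The names vocab_size and units are assumptions for illustration (the assignment passes these into the Decoder); the point is that the last Dense layer must project onto the vocabulary, so its units argument is the vocabulary size (12000), not the hidden size (256):

```python
import tensorflow as tf

vocab_size = 12000  # assumed: size of the target vocabulary
units = 256         # assumed: hidden size of the decoder states

# The final projection maps each 256-dim decoder state to a
# log-probability over the whole vocabulary. If units is passed
# here instead of vocab_size, the logits come out (64, 15, 256).
output_layer = tf.keras.layers.Dense(
    units=vocab_size,
    activation=tf.nn.log_softmax,
)

# A batch of 64 sequences of length 15, each position a 256-dim state:
x = tf.random.uniform((64, 15, units))
logits = output_layer(x)
print(logits.shape)  # (64, 15, 12000)
```

With units=vocab_size, exponentiating the log_softmax output at any position sums to 1 over the 12000 vocabulary entries, which is what the expected output checks.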


Thank you very much. I hadn’t noticed that I had set the units incorrectly. I was focused on the call method and the other layers (CrossAttention and PostAttention).
:+1:
