NLP C4W1 Exercise 3 Decoder

Hello, I have an issue with Week 1, Exercise 3 of Course 4, NLP with Attention Models. The problem appears when displaying the output: the shape of the logits is (64, 15, 256) instead of the expected (64, 15, 12000).

Tensor of contexts has shape: (64, 14, 256)
Tensor of right-shifted translations has shape: (64, 15)
Tensor of logits has shape: (64, 15, 256)

Expected Output
Tensor of contexts has shape: (64, 14, 256)
Tensor of right-shifted translations has shape: (64, 15)
Tensor of logits has shape: (64, 15, 12000)

I’m unsure where the error lies, but I suspect it might be in CrossAttention or PostAttention.

In CrossAttention, I pass context and x as parameters; in PostAttention, only x is passed.

Any ideas on what I might be doing wrong?

Thank you in advance.
Regards.

The issue should be here:

The dense layer with log_softmax activation

    self.output_layer

specifically the units parameter here!
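To illustrate the fix, here is a minimal sketch of that final layer in TensorFlow. The names vocab_size and units are assumptions for illustration (the assignment passes these into the Decoder); the point is that the last Dense layer must project onto the vocabulary, so its units argument is the vocabulary size (12000), not the hidden size (256):

```python
import tensorflow as tf

vocab_size = 12000  # assumed: size of the target vocabulary
units = 256         # assumed: hidden size of the decoder states

# The final projection maps each 256-dim decoder state to a
# log-probability over the whole vocabulary. If units is passed
# here instead of vocab_size, the logits come out (64, 15, 256).
output_layer = tf.keras.layers.Dense(
    units=vocab_size,
    activation=tf.nn.log_softmax,
)

# A batch of 64 sequences of length 15, each position a 256-dim state:
x = tf.random.uniform((64, 15, units))
logits = output_layer(x)
print(logits.shape)  # (64, 15, 12000)
```

With units=vocab_size, exponentiating the log_softmax output at any position sums to 1 over the 12000 vocabulary entries, which is what the expected output checks.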


Thank you very much. I hadn’t noticed that I had set the units incorrectly. I was focused on the call method and the other layers (CrossAttention and PostAttention).
:+1:
