Hello, I have an issue with Week 1, Exercise 3 of Course 4 (NLP with Attention Models). The problem shows up when displaying the output: the shape of the logits is (64, 15, 256) instead of the expected (64, 15, 12000).
My output:
Tensor of contexts has shape: (64, 14, 256)
Tensor of right-shifted translations has shape: (64, 15)
Tensor of logits has shape: (64, 15, 256)
Expected output:
Tensor of contexts has shape: (64, 14, 256)
Tensor of right-shifted translations has shape: (64, 15)
Tensor of logits has shape: (64, 15, 12000)
I’m unsure where the error lies, but I suspect it is in CrossAttention or in the post-attention layers. In CrossAttention I pass context and x as parameters; in the post-attention layers I pass only x.
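For context, here is a minimal sketch (not the course code; the variable names are my own) of how I understand the last step should behave. Since 256 matches the number of decoder units and 12000 the vocabulary size, I believe the post-attention output is supposed to go through a final Dense layer with vocab_size units to produce the logits:

```python
import tensorflow as tf

# Hypothetical shapes matching the exercise output above
batch, seq_len, units, vocab_size = 64, 15, 256, 12000

# Stand-in for the post-attention decoder output, shape (64, 15, 256)
post_attention_output = tf.random.normal((batch, seq_len, units))

# A final Dense layer over the vocabulary turns (64, 15, 256)
# into the expected logits shape (64, 15, 12000)
final_dense = tf.keras.layers.Dense(vocab_size)
logits = final_dense(post_attention_output)
print(logits.shape)  # (64, 15, 12000)
```

If my decoder's last dimension is still 256, does that mean I am skipping this final projection (or building it with units instead of vocab_size)?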
Any ideas on what I might be doing wrong?
Thank you in advance.
Regards.