C3M3_Assignment - Decoder definition error

I have a problem with the decoder, and the task is too big for me to pinpoint where the error is. Do you have any idea where to dig?

======================================================================
 Decoder - Main Layers
======================================================================

Layer                          Type                           Parameters
----------------------------------------------------------------------
token_emb                      Embedding                       1,280,000
pos_enc                        PositionalEncoding                      0
dropout                        Dropout                                 0
transformer_decoder            TransformerDecoder              2,372,352
output_projection              Linear                          1,285,000
----------------------------------------------------------------------
TOTAL                                                          4,937,352
======================================================================

Expected Output

======================================================================
 Decoder - Main Layers
======================================================================

Layer                          Type                           Parameters
----------------------------------------------------------------------
token_emb                      Embedding                       1,280,000
pos_enc                        PositionalEncoding                      0
dropout                        Dropout                                 0
transformer_decoder            TransformerDecoder              2,372,352
output_projection              Linear                          1,285,000
----------------------------------------------------------------------
TOTAL                                                          4,937,352
======================================================================

unittests.exercise_2(Decoder)

Failed test case: Failed forward pass test.
Expected: None
Got: shape '[1, 1, 8, 8]' is invalid for input of size 16
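
For anyone reading the message above: it is the error PyTorch raises when a `view`/`reshape` is asked for a shape whose element count does not match the tensor. A minimal sketch of that arithmetic, using only the numbers from the message (`can_reshape` is a hypothetical helper for illustration, not part of the lab or of PyTorch):

```python
from math import prod

def can_reshape(num_elements, target_shape):
    """Return True if num_elements values fill target_shape exactly."""
    return prod(target_shape) == num_elements

# Numbers taken from the error message: 16 values, requested shape [1, 1, 8, 8].
print(can_reshape(16, (1, 1, 8, 8)))  # 1*1*8*8 = 64 slots for 16 values
print(can_reshape(16, (1, 1, 4, 4)))  # 1*1*4*4 = 16 slots for 16 values
```

This does not say which line in the decoder is at fault, only that some tensor reaching a reshape has a different size than that step expects.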

I think it has to do with your definition of self.token_emb. Try to follow the hints as closely as possible.

I think I skipped Hint 7, but I cannot find the place in the lab template where it is supposed to be implemented.

    # Convert into token embeddings

I put the new code here, but the error persists.

Hint 7: Forward Pass - Embedding with Scaling
What you need: Embed tokens and scale by sqrt(d_model).
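
A minimal sketch of what this hint describes, not the lab's reference solution. The sizes are an inference from the parameter table (an Embedding with 1,280,000 weights and a Linear with 1,285,000 would fit a vocabulary of 5,000 and `d_model` of 256); the variable names and the batch shape are assumptions for illustration:

```python
import math

import torch
import torch.nn as nn

vocab_size, d_model = 5000, 256   # inferred from the table, not confirmed by the lab
token_emb = nn.Embedding(vocab_size, d_model)

# Dummy batch of token ids with shape (batch, seq_len); values are arbitrary.
tokens = torch.randint(0, vocab_size, (2, 7))

# Embed, then scale by sqrt(d_model) as the hint asks.
x = token_emb(tokens) * math.sqrt(d_model)

print(x.shape)  # (batch, seq_len, d_model)
```

The scaling is a plain multiplication, so it changes the values but not the shape of the embedded tensor.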

Hey, I am struggling with this error too. Did you find a solution?

Failed test case: Failed forward pass test.
Expected: None
Got: shape '[1, 1, 1, 8]' is invalid for input of size 16

Hi everyone!
Could you DM me your solutions? Several learners are encountering this error, and since our solution passes the unit tests, the issue may be with the tests themselves.

Hey Lucas,

I read through the hints again. In the positional-encoding step of the forward pass, I had computed the positional encodings but forgot to add them to the embedded tensor. The code works now and the test passes!
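
For later readers, here is a rough sketch of the step described in this reply. It assumes the encodings come back as a separate tensor that has to be added explicitly; the lab's own PositionalEncoding class may behave differently, and `sinusoidal_encoding` is a hypothetical stand-in for it:

```python
import math

import torch

def sinusoidal_encoding(seq_len, d_model):
    """Standard sinusoidal positional encodings, shape (seq_len, d_model)."""
    position = torch.arange(seq_len, dtype=torch.float32).unsqueeze(1)
    div_term = torch.exp(
        torch.arange(0, d_model, 2, dtype=torch.float32)
        * (-math.log(10000.0) / d_model)
    )
    pe = torch.zeros(seq_len, d_model)
    pe[:, 0::2] = torch.sin(position * div_term)  # even feature indices
    pe[:, 1::2] = torch.cos(position * div_term)  # odd feature indices
    return pe

batch, seq_len, d_model = 2, 5, 16                # illustrative sizes only
embedded = torch.randn(batch, seq_len, d_model)   # stand-in for scaled token embeddings
pe = sinusoidal_encoding(seq_len, d_model)

# The step that is easy to miss: add the encodings to the embeddings.
x = embedded + pe   # pe broadcasts across the batch dimension

print(x.shape)  # (batch, seq_len, d_model)
```

If the sum is skipped, the tensor passed on to the decoder layers is not the one the rest of the forward pass expects.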

Thanks
