C4W1_Assignment: Tensor of logits dimension mismatch

Tensor of contexts has shape: (64, 14, 256)
Tensor of right-shifted translations has shape: (64, 15)
Tensor of logits has shape: (64, 12000)

Expected Output
Tensor of contexts has shape: (64, 14, 256)
Tensor of right-shifted translations has shape: (64, 15)
Tensor of logits has shape: (64, 15, 12000)

I can't work out why my logits have shape (64, 12000) instead of (64, 15, 12000). Where could it have gone wrong?

The difference between a 2D tensor and a 3D tensor is pretty fundamental, right? So you must be misinterpreting the operations that the math formulas are telling you to do. One way to debug this is to add print statements in the relevant parts of the code to print the shapes of all the objects. That should at least let you narrow it down to the line of code that is incorrect.

I’m not familiar with the NLP C4 material, but I have done the equivalent assignment in DLS C5 W4. One thing to be careful about is the notational convention that distinguishes dot-product multiplication from elementwise multiplication. In DLS, Prof Ng is consistent: he uses “*” only to signify elementwise multiplication. If he does not write an explicit operator between two tensors or array objects, the operation is a dot-product multiplication.
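To make the distinction concrete, here is a minimal NumPy sketch. The arrays and their sizes are made up for illustration and are not taken from either course:

```python
import numpy as np

# Two small 2x2 arrays (hypothetical values, just for illustration).
A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[10.0, 20.0], [30.0, 40.0]])

# "*" multiplies matching entries: same shape in, same shape out.
elementwise = A * B

# "@" (equivalently np.dot for 2D arrays) takes rows of A dotted with columns of B.
matrix_product = A @ B

print(elementwise)     # [[ 10.  40.] [ 90. 160.]]
print(matrix_product)  # [[ 70. 100.] [150. 220.]]

# With non-square arrays the two operations are not interchangeable:
X = np.ones((3, 4))
W = np.ones((4, 5))
print((X @ W).shape)   # (3, 5)
# X * W would raise a ValueError, because (3, 4) and (4, 5) cannot be broadcast.
```

Mixing the two up usually shows up as either a shape error or a result with an unexpected shape, so printing `.shape` after each step is a quick check.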


Thank you for your response. My actual confusion is with the decoder’s layers in C4W1_Assignment. I suspect I have set some parameters of those layers incorrectly, so the tensor ends up with a different dimension than expected. I have been trying to figure out which one.

I figured it out. Thanks.

Nice work! Thanks for confirming.

I have the same problem. Can you tell me how you solved it? Thank you so much!

Same here. I get

“Tensor of logits has shape: (64, 12000)”

and I have no idea how to fix it.

I finally figured it out. The instructions say:

"Post-attention LSTM. Another LSTM layer. For this one you don’t need it to return the state."

I misinterpreted that as return_sequences=False, when it should be return_sequences=True. Not returning the state is controlled by return_state, which is a separate argument.
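For anyone who wants to see the effect on shapes, here is a small standalone sketch using tf.keras. It is not the assignment code: the sizes, variable names, and the random input are all made up for illustration. It only shows how return_sequences changes the rank of what the following Dense layer produces.

```python
import tensorflow as tf

# Hypothetical small sizes; the assignment uses (64, 15, 256) and a 12000-word vocab.
batch, steps, units, vocab = 4, 15, 8, 50

# Stand-in for the tensor that reaches the post-attention LSTM.
x = tf.random.normal((batch, steps, units))

# return_sequences defaults to False: only the output at the last time step is kept.
last_only = tf.keras.layers.LSTM(units)(x)

# return_sequences=True keeps one output per time step.
# return_state is left at its default (False), so no state is returned.
all_steps = tf.keras.layers.LSTM(units, return_sequences=True)(x)

# A Dense layer acts on the last axis, so it preserves whatever rank it is given.
logits_2d = tf.keras.layers.Dense(vocab)(last_only)
logits_3d = tf.keras.layers.Dense(vocab)(all_steps)

print("last_only:", last_only.shape)  # (4, 8)
print("all_steps:", all_steps.shape)  # (4, 15, 8)
print("logits_2d:", logits_2d.shape)  # (4, 50)
print("logits_3d:", logits_3d.shape)  # (4, 15, 50)
```

The 2D result corresponds to the (64, 12000) symptom in this thread, and the 3D result corresponds to the expected (64, 15, 12000).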

hope this helps


In my case, I was not returning the sequences in the post-attention RNN. Please check that.
