In C4W1_Assignment, Exercise 3 (Decoder), the input is the encoder output, which has shape (64, 14, 256). If you pass it to an Embedding layer with embedding dimension 256, the output shape becomes (64, 14, 256, 256), which can’t be given to the LSTM: it raises an error saying it expected an input with 3 dimensions but got 4. If we instead set the embedding dimension to 1 and pass x[:,:,:,0] to the pre-attention layer, the Cross Attention layer fails, complaining that “cannot compute Einsum as input #1(zero-based) was expected to be a int64 tensor but is a float tensor [Op:Einsum] name:”. None of this week's labs contain code like this, so I am clueless about how to handle it.
You should not do that. The encoder output is already embedded and should not be embedded again; the decoder should use it as is, and only when calculating cross attention.
Thank you. I realized it later: I have to send “target” to the Embedding layer, not “context”. A block diagram showing the flow would have made this easier.
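For anyone who hits the same error, here is a rough sketch of the shapes involved. It is not the assignment's code: it uses plain NumPy in place of the Keras layers, a single-head dot-product attention in place of the assignment's Cross Attention layer, and it skips the pre-attention LSTM entirely. The vocabulary size, the target length of 15, and the names `embed` and `cross_attention` are all made up for illustration.

```python
import numpy as np

# Sizes taken from the thread, except where marked hypothetical.
BATCH, CONTEXT_LEN, UNITS = 64, 14, 256
TARGET_LEN = 15      # hypothetical target sequence length
VOCAB_SIZE = 1000    # hypothetical vocabulary size

rng = np.random.default_rng(0)
embedding_table = rng.normal(size=(VOCAB_SIZE, UNITS)).astype(np.float32)


def embed(ids):
    """Embedding lookup: appends one axis of size UNITS to the shape of ids."""
    return embedding_table[ids]


def cross_attention(query, context):
    """Simplified single-head attention: query attends over context."""
    scores = np.einsum("btd,bsd->bts", query, context) / np.sqrt(UNITS)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    attended = np.einsum("bts,bsd->btd", weights, context)
    return attended, weights


# Encoder output ("context"): already a float tensor of embedded features.
context = rng.normal(size=(BATCH, CONTEXT_LEN, UNITS)).astype(np.float32)
# Decoder input ("target"): integer token ids.
target = rng.integers(0, VOCAB_SIZE, size=(BATCH, TARGET_LEN))

# Wrong: embedding the context adds a fourth axis.
# Only one example is used here to keep memory small.
wrong = embed(context[:1].astype(np.int64))
print(wrong.shape)            # 4 dimensions, which an LSTM would reject

# Right: embed the target ids, then use the context only in cross attention.
target_embedded = embed(target)
print(target_embedded.shape)  # 3 dimensions: (batch, target_len, units)

attended, weights = cross_attention(target_embedded, context)
print(attended.shape)         # same shape as target_embedded
```

The point is only the data flow: the integer `target` ids go through the embedding, while `context` is left untouched and enters only as the key/value side of the attention step.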