PyTorch implementation of an encoder-decoder with attention for neural machine translation

Hello, can you tell me how I can implement this model in PyTorch? I have already implemented models like this before, but without attention, and now I want to implement this one in PyTorch. I have several questions and problems, mostly about how to handle the data flow into both the encoder and the pre-attention decoder. (The model is from the first week's programming assignment for NMT.)

Personally, I’d use this, since it’s already been written and debugged.

https://pytorch.org/docs/stable/generated/torch.nn.Transformer.html

Thanks. It's mostly for practice. I'm trying to understand it as much as possible.

Hi @Mahdi_Seddigh

I think you have a good approach - trying to understand things by building them. I did the same :slight_smile: (I implemented the labs in PyTorch and in Excel).

You might find this post helpful if you feel stuck.
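As a starting point for the data-flow question, here is a minimal sketch of how the pieces could connect in PyTorch. All the names and dimensions here are my own choices, not the assignment's: the key idea is that the encoder and the pre-attention decoder each process their own sequence independently, and their outputs only meet inside the attention layer, where the decoder states act as queries and the encoder states as keys and values.

```python
import torch
import torch.nn as nn

class Seq2SeqAttention(nn.Module):
    """Toy encoder + pre-attention decoder + attention (illustrative sketch)."""

    def __init__(self, vocab_size=100, emb_dim=32, hid_dim=32):
        super().__init__()
        self.src_emb = nn.Embedding(vocab_size, emb_dim)
        self.tgt_emb = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.pre_attn_decoder = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        # Queries come from the decoder; keys/values come from the encoder.
        self.attn = nn.MultiheadAttention(hid_dim, num_heads=1, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, src_tokens, tgt_tokens):
        # Two separate streams: source through the encoder...
        enc_out, _ = self.encoder(self.src_emb(src_tokens))           # (B, S, H)
        # ...and (shifted-right) target through the pre-attention decoder.
        dec_out, _ = self.pre_attn_decoder(self.tgt_emb(tgt_tokens))  # (B, T, H)
        # The streams merge here: attend over encoder states.
        ctx, _ = self.attn(query=dec_out, key=enc_out, value=enc_out) # (B, T, H)
        return self.out(ctx)                                          # (B, T, vocab)

model = Seq2SeqAttention()
src = torch.randint(0, 100, (2, 7))  # batch of 2 source sentences, length 7
tgt = torch.randint(0, 100, (2, 5))  # shifted-right targets, length 5
logits = model(src, tgt)
print(logits.shape)  # torch.Size([2, 5, 100])
```

So during training you feed the source batch and the shifted-right target batch in parallel; there is no step-by-step loop needed until inference time.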

Cheers