In the NMT training code with LSTM and cross-attention, the entire right-shifted target sequence is passed to the decoder, whose output is then passed to the cross-attention layer. At prediction time, by contrast, the same decoder is called once per predicted word.
Does this mean that during training multiple decoder hidden states are passed to cross-attention at once, and that cross-attention correctly computes the attention for each decoder hidden state in one go?
Is this a question about one of the courses? Or about your own project?
Which course are you taking? Your question relates to the assignment in NLP course 4 week 1, so I have moved it there. Hopefully the mentors for that course can answer you.
You are right about how the decoder is used at training time versus prediction time.
In the code, during training, the internal hidden states of the pre-attention decoder's LSTM are not what is passed to the attention mechanism. Because we use the shifted-right target sequence (teacher forcing), every target position is available at once: the whole sequence passes through the pre-attention decoder, and only the LSTM's output sequence, together with the encoder's output, is fed into the attention mechanism, as the queries and the keys/values respectively.
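To see why attending with all decoder steps at once gives the same result as attending one step at a time, here is a minimal NumPy sketch (not the course's actual TF code; the shapes and the plain scaled dot-product attention are my own illustrative assumptions). Each row of the query matrix is one decoder time step, so the batched matrix computation is just the per-step computation stacked:

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    # q: (T_q, d) queries, k/v: (T_kv, d) keys and values.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    # Softmax over the key dimension, row by row.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
d = 8
enc_out = rng.normal(size=(5, d))      # encoder outputs: keys and values
dec_out = rng.normal(size=(4, d))      # pre-attention decoder LSTM outputs
                                       # (training: all 4 target steps at once)

# Training: all decoder steps attend to the encoder in one matrix call.
ctx_all = scaled_dot_product_attention(dec_out, enc_out, enc_out)

# Prediction: the same function is called with a single decoder step.
ctx_step = scaled_dot_product_attention(dec_out[2:3], enc_out, enc_out)

# Row 2 of the batched result matches the single-step result.
assert np.allclose(ctx_all[2], ctx_step[0])
```

Since the softmax is applied independently per query row, processing all decoder hidden states "in one go" during training is mathematically identical to the step-by-step calls made at prediction time.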
I am referring to a previous discussion on this. FYI, that thread discusses the implementation in the paper, which differs slightly from the TF code.