In the Neural Machine Translation (NMT) model that translates human-readable dates ("25th of June, 2009") into machine-readable dates ("2009-06-25"), the post-attention LSTM at time t does not take the previous time step's prediction y⟨t−1⟩ as input.
If we need an attention model in which the post-attention LSTM layer takes both y(t-1) (the previous output) and the context vector at that time step as inputs (for example, in general machine translation), can we just concatenate y(t-1) with the context vector, or can we pass two separate tensors to the LSTM layer through its call argument `inputs`?
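Something like the sketch below is what I have in mind. As far as I know, a Keras LSTM layer's call takes a single `inputs` tensor (plus `initial_state`), so concatenating y(t-1) with the context vector along the feature axis seems to be the straightforward route. All names and dimensions here (Tx, Ty, n_a, n_s, human_vocab, machine_vocab, one_step_attention) are placeholders I made up, not the assignment's reference code:

```python
from tensorflow.keras.layers import (Input, LSTM, Bidirectional, Dense, Dot,
                                     Concatenate, RepeatVector, Reshape, Softmax)
from tensorflow.keras.models import Model

Tx, Ty = 30, 10                        # input / output sequence lengths (placeholders)
n_a, n_s = 32, 64                      # encoder Bi-LSTM units, post-attention LSTM units
human_vocab, machine_vocab = 37, 11    # hypothetical vocabulary sizes

# --- attention block (layers shared across output time steps) ---
repeator = RepeatVector(Tx)
concat_attn = Concatenate(axis=-1)
densor1 = Dense(10, activation="tanh")
densor2 = Dense(1, activation="relu")
attn_softmax = Softmax(axis=1)         # normalize attention weights over the Tx axis
dotor = Dot(axes=1)

def one_step_attention(a, s_prev):
    """Return a context vector (batch, 1, 2*n_a) from encoder activations `a`
    and the previous post-attention LSTM hidden state `s_prev`."""
    s_rep = repeator(s_prev)                       # (batch, Tx, n_s)
    concat = concat_attn([a, s_rep])               # (batch, Tx, 2*n_a + n_s)
    energies = densor2(densor1(concat))            # (batch, Tx, 1)
    alphas = attn_softmax(energies)                # attention weights
    return dotor([alphas, a])                      # (batch, 1, 2*n_a)

# --- decoder layers shared across time steps ---
post_lstm = LSTM(n_s, return_state=True)
output_layer = Dense(machine_vocab, activation="softmax")
concat_in = Concatenate(axis=-1)
reshaper = Reshape((1, machine_vocab))

# --- model ---
X = Input(shape=(Tx, human_vocab))                 # source sequence (one-hot)
s0 = Input(shape=(n_s,), name="s0")                # initial hidden state
c0 = Input(shape=(n_s,), name="c0")                # initial cell state
y0 = Input(shape=(1, machine_vocab), name="y0")    # start-of-sequence token
s, c, y_prev = s0, c0, y0

a = Bidirectional(LSTM(n_a, return_sequences=True))(X)

outputs = []
for t in range(Ty):
    context = one_step_attention(a, s)             # (batch, 1, 2*n_a)
    lstm_in = concat_in([context, y_prev])         # one merged input tensor
    s, _, c = post_lstm(lstm_in, initial_state=[s, c])
    out = output_layer(s)                          # (batch, machine_vocab)
    outputs.append(out)
    # feed this step's soft prediction back as y(t-1); during training one
    # would usually teacher-force the ground-truth token here instead
    y_prev = reshaper(out)

model = Model(inputs=[X, s0, c0, y0], outputs=outputs)
```

If the LSTM really had to receive two separate tensors per step rather than one concatenated tensor, that would mean writing a custom recurrent cell instead of using the built-in LSTM layer's single `inputs` argument, as far as I can tell.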
Were you able to find an answer to your question?