Hi @pongyuenlam
- For the code line:
"Pass the embedded input into the pre attention LSTM"
Hints:
- The LSTM you defined earlier should return the output alongside the state (made up of two tensors).
- Pass the state in to the LSTM (needed for inference).
The decoder layer's call signature clearly specifies `state=None`, but you have used `initial_state=state`, which is causing this error. It should be `initial_state=None`.
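To make the first hint concrete, here is a minimal sketch of how a Keras LSTM built with `return_state=True` returns its output alongside the two state tensors, and how a state can be fed back in at inference. The layer and variable names here are illustrative, not the assignment's exact code:

```python
import tensorflow as tf

# Illustrative names, not the assignment's code.
# return_state=True makes the LSTM return its output together with
# the hidden state and cell state tensors.
pre_attention_lstm = tf.keras.layers.LSTM(
    units=8, return_sequences=True, return_state=True
)

x = tf.random.normal((2, 5, 8))               # (batch, timesteps, features)
output, hidden, cell = pre_attention_lstm(x)  # state = [hidden, cell]

# During inference, the previous state can be passed back in:
output2, hidden2, cell2 = pre_attention_lstm(x, initial_state=[hidden, cell])
```

Note that `initial_state` is only supplied explicitly when you actually have a state to pass; otherwise it defaults to `None`.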
-
Another issue I can see is in the code line:
"Get the embedding of the input"
You have used the wrong input. Remember, it is the right-shifted translation that is used as the input.
-
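To illustrate what "right-shifted translation" means (tokens below are made up for the example): the decoder's input is the target sequence shifted right, so at each step the model sees the previous target token and is trained to predict the next one.

```python
# Illustrative teacher-forcing example; tokens are made up.
translation = ["[SOS]", "je", "suis", "content", "[EOS]"]

decoder_input = translation[:-1]  # right-shifted input: starts at [SOS]
labels = translation[1:]          # what the decoder should predict
```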
For the code line below:
"Perform cross attention between the context and the output of the LSTM (in that order)"
You are supposed to use the attention layer you defined earlier for cross attention, not the `CrossAttention` class directly.
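The pattern being pointed at is: create the attention layer once in `__init__` and reuse it in `call`, rather than constructing a new layer inside `call`. A minimal sketch, using `tf.keras.layers.MultiHeadAttention` as a stand-in for the assignment's `CrossAttention` (class and attribute names here are illustrative):

```python
import tensorflow as tf

# Illustrative decoder fragment, not the assignment's exact code.
class Decoder(tf.keras.layers.Layer):
    def __init__(self, units):
        super().__init__()
        # The attention layer is created ONCE here...
        self.attention = tf.keras.layers.MultiHeadAttention(
            num_heads=1, key_dim=units
        )

    def call(self, context, lstm_output):
        # ...and reused here via self.attention, instead of
        # instantiating a fresh attention layer on every call.
        return self.attention(query=lstm_output, value=context, key=context)

dec = Decoder(units=8)
context = tf.random.normal((2, 7, 8))      # encoder output
lstm_output = tf.random.normal((2, 5, 8))  # pre-attention LSTM output
out = dec(context, lstm_output)
```

Instantiating the layer inside `call` would create fresh, untrained weights on every forward pass, which is why the pre-defined layer must be used.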
Regards
DP