Does anyone know what I'm doing wrong?
If you compare your output to the expected output, only the logit shape matches, so I would check the cross-attention code as well as the decoder's call function.
I found the root cause. I hadn't set return_sequences=True in the post_attention_rnn, which should be identical to the pre_attention_rnn except that return_state should be set to False.
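For illustration, here is a minimal sketch of the two layer configurations being described, assuming Keras LSTM layers and an arbitrary unit count of 128 (the layer names and sizes are placeholders, not the assignment's exact code):

```python
import tensorflow as tf

# pre_attention_rnn: returns the full output sequence AND its final states
# (return_sequences=True, return_state=True), so the per-step outputs can
# feed the attention mechanism.
pre_attention_rnn = tf.keras.layers.LSTM(
    units=128,
    return_sequences=True,
    return_state=True,
)

# post_attention_rnn: identical except return_state=False. It still needs
# return_sequences=True so it emits an output for every decoder time step;
# without it, only the last step is returned and the logit shape is the
# only thing that still matches the expected output.
post_attention_rnn = tf.keras.layers.LSTM(
    units=128,
    return_sequences=True,
    return_state=False,
)
```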