Problem with transformer

Hi @nguyen1207

This picture:

comes from the original paper, “Attention Is All You Need” (page 3). The architecture was originally designed for translation: the left side (the encoder) takes the sentence in the language you are translating from, and the right side (the decoder) takes the translation produced “so far”. This picture is often used when talking about transformers in general.

But the idea applies to more than translation. This week (C4 W2) the transformer is used for summarization, so we do not use the left side (inputs in another language). We only use the right side, slightly modified for our purpose.
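
To make "only the right side" a bit more concrete, here is a minimal sketch in plain Python. It is not the assignment's code. It assumes the usual decoder-only setup, where the core piece is self-attention with a causal (look-ahead) mask so that each position only sees itself and earlier positions. The names `causal_mask` and `masked_self_attention` are made up for this post, and the learned projections, multiple heads, and feed-forward layers are left out to keep it short.

```python
import math

def causal_mask(seq_len):
    """mask[i][j] is True when position i may attend to position j (j <= i)."""
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]

def masked_self_attention(x, mask):
    """Single-head self-attention over x, a list of equal-length float vectors.

    Simplification: queries, keys and values are all x itself
    (no learned projection matrices).
    """
    d_model = len(x[0])
    outputs, all_weights = [], []
    for i, query in enumerate(x):
        # Scaled dot-product scores; masked (future) positions get -inf.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(d_model)
            if mask[i][j] else float("-inf")
            for j, key in enumerate(x)
        ]
        # Numerically stable softmax; exp(-inf) is exactly 0.0.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Output for position i is the weighted sum of the value vectors.
        outputs.append([
            sum(w * value[d] for w, value in zip(weights, x))
            for d in range(d_model)
        ])
        all_weights.append(weights)
    return outputs, all_weights

# Toy "sequence" of three 2-dimensional token vectors.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mask = causal_mask(len(x))
out, weights = masked_self_attention(x, mask)
```

The first position can only attend to itself, so its output is just its own vector, and every weight above the diagonal is zero. There is no second attention step over encoder outputs here because there is no encoder to attend to.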