What is the difference between the two outputs in the transformer

Amazing_Patrick · April 28, 2023, 9:09pm

What is the difference between the two outputs in the transformer?

And why, in the decoder, does the output become the input at the bottom?

arvyzukai · May 2, 2023, 4:48am

Hi @Amazing_Patrick

The question is not entirely clear. In both cases the outputs are probabilities (the output of the softmax layer); what they represent depends on the context in which each model is used and on the dataset it was trained on.

For example, the top picture is probably taken from the original paper, where the model was trained to translate text. The inputs to the left branch were sentences in one language, while the inputs to the right branch (labelled Outputs, meaning the outputs “so far”) are the translated text produced so far. The outputs are probabilities for the next word in the target language. That is why the decoder's output shows up as its input: each predicted word is appended to the outputs so far and fed back in to predict the next one.
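To make the "outputs so far" idea concrete, here is a minimal sketch of a greedy decoding loop. It is only an illustration: `toy_decoder`, `TOY_TRANSLATIONS`, and the tiny vocabulary are hypothetical stand-ins I made up, not a real transformer and not the course's code. The point is just the loop: each predicted token is appended to the decoder input for the next step.

```python
import math

# Hypothetical toy vocabulary and lookup table, invented for illustration only.
BOS, EOS = "<bos>", "<eos>"
VOCAB = [BOS, "ich", "bin", "hier", EOS]
TOY_TRANSLATIONS = {("i", "am", "here"): ["ich", "bin", "hier"]}


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def toy_decoder(source_tokens, outputs_so_far):
    """Stand-in for encoder + decoder: returns next-token probabilities.

    A real transformer would attend over source_tokens and outputs_so_far;
    here a lookup table plays that role so the example stays runnable.
    """
    target = TOY_TRANSLATIONS[tuple(source_tokens)]
    step = len(outputs_so_far) - 1  # tokens generated so far, excluding BOS
    next_token = target[step] if step < len(target) else EOS
    scores = [5.0 if tok == next_token else 0.0 for tok in VOCAB]
    return softmax(scores)


def greedy_translate(source_tokens, max_len=10):
    outputs = [BOS]  # the decoder input starts with a start-of-sequence token
    for _ in range(max_len):
        probs = toy_decoder(source_tokens, outputs)
        next_token = VOCAB[probs.index(max(probs))]  # pick the most likely token
        outputs.append(next_token)  # the output becomes decoder input next step
        if next_token == EOS:
            break
    return outputs[1:]  # drop BOS


print(greedy_translate(["i", "am", "here"]))
```

During training the right branch typically receives the known target sentence shifted by one position instead of the model's own predictions, but at inference time the loop looks like the one above.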

The bottom picture could be any generative model, such as a chatbot, where the model attempts to generate text like the text in its dataset, usually with some added randomness.
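The "added randomness" is often implemented by sampling from the softmax distribution instead of always taking the most likely token. Below is a rough sketch of temperature sampling, one common way to do this; the function name `sample_next` and its parameters are my own, and real chatbots may use other schemes.

```python
import math
import random


def sample_next(scores, temperature=1.0, rng=random):
    """Sample a token index from softmax(scores / temperature).

    temperature must be > 0. Values below 1 make the choice closer to
    greedy; values above 1 make it more random.
    """
    scaled = [s / temperature for s in scores]
    m = max(scaled)
    weights = [math.exp(s - m) for s in scaled]  # unnormalised softmax weights
    # random.choices normalises the weights internally.
    return rng.choices(range(len(scores)), weights=weights, k=1)[0]


# Example: three candidate tokens with raw scores; index 1 is the most likely.
rng = random.Random(0)
picks = [sample_next([1.0, 2.0, 0.5], temperature=1.0, rng=rng) for _ in range(5)]
print(picks)
```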
