Token, Encoder and decoder meaning

Zerna_Lahouari · July 1, 2023, 6:46pm

What is a token, and how do I code it? What are the decoder inputs? Are they the same as the encoder inputs?

reinoudbosch · July 1, 2023, 9:01pm

Hi Zerna_Lahouari,

A token is a numerical representation of a word, part of a word, or a character, depending on the specifics of the model. Tokens are produced in a process called tokenization, in which a tokenizer object or function converts text into a sequence of tokens. You can find an example of the code of a tokenizer here.
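To make the idea concrete, here is a minimal sketch of a word-level tokenizer. It is a toy and not the tokenizer of any particular model: the names `build_vocab` and `tokenize` and the `<pad>`/`<unk>` entries are my own assumptions, and real models usually split text into subwords rather than whole words.

```python
# Toy word-level tokenizer: a sketch for illustration only.
# Function names and special tokens are hypothetical, not from a real library.

def build_vocab(texts):
    """Assign an integer id to every distinct lowercase word in the texts."""
    vocab = {"<pad>": 0, "<unk>": 1}  # reserved ids (an assumption)
    for text in texts:
        for word in text.lower().split():
            if word not in vocab:
                vocab[word] = len(vocab)
    return vocab


def tokenize(text, vocab):
    """Turn a sentence into a list of token ids; unknown words map to <unk>."""
    return [vocab.get(word, vocab["<unk>"]) for word in text.lower().split()]


vocab = build_vocab(["the cat sat", "the dog sat"])
token_ids = tokenize("the cat barked", vocab)
print(token_ids)  # [2, 3, 1]  ("barked" is not in the vocabulary)
```

The point is only that the model never sees the text itself, just the list of integers.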
Decoder inputs can be similar to encoder inputs when they are used to calibrate (‘train’) the model, e.g. in the case of a chatbot. The model aims to predict the next token, so during calibration the correct next token must be provided to the decoder in order to calculate the loss that steers the adjustment of the model's parameters. The correct next token is also needed as decoder input for predicting the token that follows it.
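As a rough sketch of what this looks like in practice, a common setup (often called teacher forcing) feeds the decoder the target sequence shifted one position to the right and compares its outputs with the unshifted target. The ids for `BOS` and `EOS` and the helper `make_decoder_pair` are assumptions for illustration, not part of any specific library.

```python
# Sketch of building decoder inputs and labels for training.
# BOS/EOS ids and the helper name are hypothetical.

BOS, EOS = 1, 2  # assumed ids for "beginning of sequence" and "end of sequence"


def make_decoder_pair(target_ids):
    """Return (decoder_input, labels): the target shifted right, and the target itself."""
    decoder_input = [BOS] + target_ids  # what the decoder sees
    labels = target_ids + [EOS]         # what it should predict at each position
    return decoder_input, labels


target_ids = [7, 8, 9]  # token ids of the correct answer
decoder_input, labels = make_decoder_pair(target_ids)

# At step t the decoder sees decoder_input[:t + 1] and should predict labels[t].
for step, label in enumerate(labels):
    print(decoder_input[: step + 1], "->", label)
```

The loss is then computed between the decoder's prediction at each position and the corresponding entry of `labels`.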
At prediction time, the input to the decoder consists of the tokens it has already output, and these influence the prediction of each subsequent output token.
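The loop at prediction time can be sketched as follows. Note that `toy_next_token` is a hypothetical stand-in for a trained model (it simply copies the encoder input and then stops); the `BOS`/`EOS` ids and the function names are assumptions as well. Only the structure of the loop matters here.

```python
# Sketch of generation: the decoder input grows by one token per step.
# toy_next_token is a made-up stand-in for a real trained model.

BOS, EOS = 1, 2  # assumed special token ids


def toy_next_token(encoder_ids, decoder_ids):
    """Pretend model: copies the encoder input token by token, then emits EOS."""
    position = len(decoder_ids) - 1  # tokens generated so far (BOS excluded)
    if position < len(encoder_ids):
        return encoder_ids[position]
    return EOS


def generate(encoder_ids, max_new_tokens=10):
    decoder_ids = [BOS]  # the very first decoder input is just the start token
    for _ in range(max_new_tokens):
        next_id = toy_next_token(encoder_ids, decoder_ids)
        decoder_ids.append(next_id)  # the output becomes part of the next input
        if next_id == EOS:
            break
    return decoder_ids[1:]  # drop the start token


output_ids = generate([7, 8, 9])
print(output_ids)  # [7, 8, 9, 2]
```

So the encoder input stays fixed during generation, while the decoder input is whatever the model has produced up to that point.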
