What is a token and how do I code it? What are the decoder inputs, and are they the same as the encoder inputs?
Hi Zerna_Lahouari,
A token is a numerical representation (usually an integer id) of a word, part of a word, or a single character, depending on the specifics of the model. Tokens are produced in a process called tokenization, in which a tokenizer object or function splits a text into pieces and maps each piece to its id, turning sentences into sequences of tokens. You can find an example of the code of a tokenizer here.
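To make the idea concrete, here is a minimal sketch of a word-level tokenizer in plain Python. It is only an illustration: the function names (build_vocab, encode, decode) and the special tokens (<pad>, <unk>, <bos>, <eos>) are my own assumptions, and real models typically use subword tokenizers rather than splitting on whitespace.

```python
def build_vocab(texts):
    """Assign an integer id to every distinct word seen in `texts`."""
    # Special tokens get the first ids (the names are assumptions for this sketch).
    vocab = {"<pad>": 0, "<unk>": 1, "<bos>": 2, "<eos>": 3}
    for text in texts:
        for word in text.lower().split():
            if word not in vocab:
                vocab[word] = len(vocab)
    return vocab


def encode(text, vocab):
    """Turn a sentence into a list of token ids; unknown words map to <unk>."""
    return [vocab.get(word, vocab["<unk>"]) for word in text.lower().split()]


def decode(ids, vocab):
    """Turn a list of token ids back into a string."""
    id_to_word = {idx: word for word, idx in vocab.items()}
    return " ".join(id_to_word[idx] for idx in ids)


vocab = build_vocab(["the cat sat", "the dog ran"])
ids = encode("the cat ran fast", vocab)
print(ids)                 # [4, 5, 8, 1]  ("fast" is not in the vocabulary)
print(decode(ids, vocab))  # the cat ran <unk>
```

The word "fast" was never seen when the vocabulary was built, so it falls back to the <unk> id. Subword tokenizers exist largely to avoid this: they split rare words into smaller known pieces.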
Decoder inputs are not, in general, the same as the encoder inputs, although they can look similar because both are sequences of tokens. The encoder receives the source sequence. When the model is being calibrated ('trained'), e.g. in the case of a chatbot, the decoder receives the expected output sequence. The model aims to predict the next token, so during training the correct next token has to be available in order to calculate the loss that steers the adjustment of the model's parameters. The correct token is also fed to the decoder as input for predicting the token that follows, instead of the model's own (possibly wrong) prediction.
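A common way to set this up is to shift the target sequence by one position, so that at each step the decoder sees the correct tokens so far and is scored against the correct next one. The sketch below assumes hypothetical ids BOS = 2 and EOS = 3 for the start and end markers; the helper name make_decoder_pair is also just for illustration.

```python
BOS, EOS = 2, 3  # assumed ids for the start/end-of-sequence markers


def make_decoder_pair(target_ids):
    """Build the decoder input and the labels from one target sequence.

    The decoder input is the target shifted right (it starts with BOS),
    and the labels are the tokens the model should predict at each step.
    """
    decoder_input = [BOS] + target_ids
    labels = target_ids + [EOS]
    return decoder_input, labels


decoder_input, labels = make_decoder_pair([10, 11, 12])
print(decoder_input)  # [2, 10, 11, 12]
print(labels)         # [10, 11, 12, 3]
```

At position i the decoder has seen decoder_input[0..i] and the loss compares its prediction with labels[i], which is exactly the "correct next token" mentioned above.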
At prediction time, the input to the decoder consists of the tokens it has output so far, which influence the prediction of the next output token.
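As a rough sketch, the prediction loop looks like this. The function toy_next_token is a hypothetical stand-in for a trained model (it simply copies the encoder input and then stops), and the BOS/EOS ids are assumptions; only the structure of the loop is the point.

```python
def greedy_decode(next_token_fn, encoder_ids, bos_id, eos_id, max_len=20):
    """Generate tokens one at a time, feeding previous outputs back in."""
    output = [bos_id]  # the decoder starts from the start-of-sequence token
    for _ in range(max_len):
        # The decoder input is everything generated so far.
        next_id = next_token_fn(encoder_ids, output)
        output.append(next_id)
        if next_id == eos_id:
            break
    return output[1:]  # drop the leading BOS


def toy_next_token(encoder_ids, decoder_ids):
    """Hypothetical stand-in for a model: copies the input, then emits EOS (id 3)."""
    step = len(decoder_ids) - 1
    return encoder_ids[step] if step < len(encoder_ids) else 3


result = greedy_decode(toy_next_token, [4, 5, 8], bos_id=2, eos_id=3)
print(result)  # [4, 5, 8, 3]
```

With a real model, next_token_fn would run the network and pick, for example, the most probable next token, but the feedback of previous outputs into the decoder works the same way.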