For an n-gram model, why do we use n-1 start tokens but only 1 end token?
Hi, Mukund.
Think about what we are trying to predict: the next token. For example, in a 4-gram model, given the previous 3 tokens, what is the next token? Once the model emits the <EOS> token there is nothing left to predict, so a single <EOS> is enough; any token after it would just be another <EOS>.
In contrast, at the start of a sentence the first few words would otherwise have too little context, which is why we pad with n-1 <SOS> tokens. [<SOS>, <SOS>, 'you'] is a different context from [<SOS>, 'are', 'you'] and should produce a different distribution over the next token.
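Here's a minimal sketch of the padding in code (the names `pad_sentence` and `ngram_contexts` are just made up for illustration): pad with n-1 <SOS> tokens and a single <EOS>, then slide a window to collect (context, next token) pairs.

```python
def pad_sentence(tokens, n):
    # n-1 start tokens so even the first word gets a full-length context,
    # but only one end token, since nothing is predicted after <EOS>.
    return ["<SOS>"] * (n - 1) + tokens + ["<EOS>"]

def ngram_contexts(tokens, n):
    # Yield (context, next_token) pairs for an n-gram model.
    padded = pad_sentence(tokens, n)
    for i in range(n - 1, len(padded)):
        yield tuple(padded[i - (n - 1):i]), padded[i]

# 4-gram model: 3 <SOS> tokens, 1 <EOS> token.
for context, nxt in ngram_contexts(["how", "are", "you"], n=4):
    print(context, "->", nxt)
# ('<SOS>', '<SOS>', '<SOS>') -> how
# ('<SOS>', '<SOS>', 'how')   -> are
# ('<SOS>', 'how', 'are')     -> you
# ('how', 'are', 'you')       -> <EOS>
```

Note the last pair: <EOS> only ever appears as a prediction target, never as part of a context we condition on, which is why one is enough.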