Need clarity on trigram probabilities with the help of a simple example

Hi @Shaleen_Srivastava

That is precisely what the special tokens are for: start-of-sentence (<s>) and end-of-sentence (</s>). So your corpus becomes: <s> <s> I am happy, are you? Yes, I am </s>

In this case the denominator, which is the count of the bigram "I am", stays at 2, as it should. The numerator is 1 for both P(happy | I am) and P(</s> | I am), so each probability comes out to 1/2.
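If it helps, here is a minimal sketch of that count in Python. It assumes the corpus is treated as a single token sequence with punctuation dropped. The function name `trigram_prob` and the variable names are just illustrative, not from any course code.

```python
from collections import Counter

# Padded corpus from above, punctuation removed (an assumption for this sketch).
tokens = "<s> <s> I am happy are you Yes I am </s>".split()

# Count every consecutive pair and triple of tokens.
bigrams = Counter(zip(tokens, tokens[1:]))
trigrams = Counter(zip(tokens, tokens[1:], tokens[2:]))


def trigram_prob(w1, w2, w3):
    """Unsmoothed estimate: count(w1 w2 w3) / count(w1 w2)."""
    context_count = bigrams[(w1, w2)]
    if context_count == 0:
        return 0.0
    return trigrams[(w1, w2, w3)] / context_count


p_happy = trigram_prob("I", "am", "happy")
p_end = trigram_prob("I", "am", "</s>")
print(bigrams[("I", "am")], p_happy, p_end)
```

Because </s> follows the final "I am", both occurrences of the bigram have a continuation, which is why the two probabilities add up to 1.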

Cheers
