This is about the ‘small corpus’ example with only 3 sentences, made up of the 2 words yes and no.
At 3:30, it is said that the formulation shown on screen is a bigram probability. But isn’t it a trigram probability?
P(<s> yes yes)
is trigram probability notation if the start token is counted as an element, right?
If the start token is not counted as an element, then why is it expanded to P(yes | <s>) x P(yes | yes)?
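
To make the question concrete, here is a minimal Python sketch of the expansion as I understand it. The three sentences below are made up for illustration (they are an assumption, not the corpus from the video), and the helper names `bigram_prob` and `sequence_prob` are hypothetical. It shows a three-token sequence being scored as a product of factors that each condition on only one previous token, which may be what "bigram" refers to here.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical 3-sentence corpus over the words "yes" and "no".
# This is an assumption for illustration, not the corpus from the video.
corpus = [
    ["<s>", "yes", "yes", "</s>"],
    ["<s>", "yes", "no", "</s>"],
    ["<s>", "no", "yes", "</s>"],
]

bigram_counts = Counter()   # counts of (previous token, current token) pairs
context_counts = Counter()  # counts of each token used as the previous token

for sentence in corpus:
    for prev, cur in zip(sentence, sentence[1:]):
        bigram_counts[(prev, cur)] += 1
        context_counts[prev] += 1


def bigram_prob(cur, prev):
    """Maximum-likelihood estimate of P(cur | prev)."""
    return Fraction(bigram_counts[(prev, cur)], context_counts[prev])


def sequence_prob(tokens):
    """Score a token sequence as a product of bigram factors."""
    p = Fraction(1)
    for prev, cur in zip(tokens, tokens[1:]):
        p *= bigram_prob(cur, prev)
    return p


# P(<s> yes yes) approximated as P(yes | <s>) x P(yes | yes)
print(bigram_prob("yes", "<s>"), bigram_prob("yes", "yes"))
print(sequence_prob(["<s>", "yes", "yes"]))
```

In this sketch the sequence has three tokens, but every factor looks at a pair of adjacent tokens only. A trigram model would instead need factors of the form P(word | two previous tokens).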