# How good is the Markov assumption for sequence probabilities?

So, in lectures we made the assumption that in a series,

P(tea | the teacher drinks) ≈ P(tea | drinks)

Or, more generally, that the probability of the next element in a sequence depends only on a limited history of the sequence, such as just the previous element in this case.

My question is, in practice, how good is this assumption?

Another way of looking at this: the assumption gives different sequences the same next-word probabilities whenever their recent histories match, for example:

the teacher drinks tea
the squirrel drinks tea

Because in both cases we are not looking any farther back than "drinks". Thus in the second sentence, after seeing "the squirrel drinks", we are just as likely to predict "tea" as the next word!?
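This is easy to see with a small bigram (first-order Markov) model. Below is a minimal sketch with a hypothetical toy corpus (the corpus and the `p_next` helper are my own illustration, not from the lectures): because both histories end in "drinks", the maximum-likelihood bigram estimate assigns the identical probability to "tea" regardless of the subject.

```python
from collections import Counter

# Hypothetical toy corpus for illustration.
corpus = [
    "the teacher drinks tea",
    "the teacher drinks coffee",
    "the squirrel eats nuts",
]

bigrams = Counter()   # counts of (previous word, next word) pairs
unigrams = Counter()  # counts of words appearing as a "previous word"
for sentence in corpus:
    words = sentence.split()
    unigrams.update(words[:-1])
    bigrams.update(zip(words, words[1:]))

def p_next(word, prev):
    # Maximum-likelihood bigram estimate: count(prev, word) / count(prev).
    # The model sees only `prev`, never the rest of the history.
    return bigrams[(prev, word)] / unigrams[prev]

# The history "the teacher" vs "the squirrel" is invisible to the model:
# both reduce to P(tea | drinks).
print(p_next("tea", "drinks"))  # 0.5
```

Whether the history was "the teacher" or "the squirrel" simply cannot influence the estimate, which is exactly the limitation the question points at.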

So I wonder, is this a good assumption, or just something we assume because we have to assume something when the full sequence is not found in our corpus? Thanks.

You're right to question the assumption made during lectures that the probability of the next element in a sequence depends only on a limited history, such as the previous element. In practice, this assumption is not ideal for NLP tasks, because language often exhibits long-range dependencies that reach well beyond the immediate context.

While this kind of assumption may be suitable for games like chess or tic-tac-toe, where the current state is sufficient to make decisions (it doesn't matter how you got into this state; all that matters is the state itself), it falls short when dealing with the complexity of natural language.

In the past, such assumptions were used because they provided a straightforward way to model language and could achieve simple language-processing goals. They can still be useful if you have simple applications in mind and your priorities are speed and low cost. However, with advancements in NLP (which will be covered later in the course), we can now better capture the rich and intricate patterns present in natural language.

In conclusion, it's essential to understand the historical context and the simpler techniques, but for real-world NLP applications, more sophisticated tools are needed to handle the intricacies of language.

Cheers


Thanks for that context!
