How good is the Markov assumption for sequence probabilities?

Joel_Wigton · July 20, 2023, 6:14pm

So, in lectures we made the assumption that in a series,

P(tea | the teacher drinks) ≈ P(tea | drinks)

Or, more generally, that the probability of the next element in a sequence depends only on a limited history of the sequence, such as just the previous element in this case.
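Written out for the bigram case, this is the chain rule followed by the approximation. The notation is my own shorthand and may not match the lecture slides: $w_i$ is the i-th word and $w_0$ is an assumed start-of-sentence token.

```latex
P(w_1, \dots, w_n)
  = \prod_{i=1}^{n} P(w_i \mid w_0, w_1, \dots, w_{i-1})
  \approx \prod_{i=1}^{n} P(w_i \mid w_{i-1})
```

The first equality is exact; only the second step is the Markov assumption.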

My question is, in practice, how good is this assumption?

Another way of looking at this: it would give the last word the same conditional probability in both of these sequences:

the teacher drinks tea
the squirrel drinks tea

Because in both cases we are not looking any farther back. Thus in the second sentence, after seeing “the squirrel drinks” we are likely to predict “tea” as the next word!?
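To make this concrete, here is a minimal sketch of a count-based bigram estimate. The toy corpus and the helper name `bigram_prob` are invented for illustration and are not from the course materials.

```python
from collections import Counter

# Toy corpus, made up for this example.
corpus = [
    "the teacher drinks tea",
    "the teacher drinks coffee",
    "the student drinks tea",
    "the squirrel eats nuts",
]

bigram_counts = Counter()   # counts of (previous word, word) pairs
context_counts = Counter()  # counts of the previous word alone

for sentence in corpus:
    tokens = sentence.split()
    for prev, word in zip(tokens, tokens[1:]):
        bigram_counts[(prev, word)] += 1
        context_counts[prev] += 1


def bigram_prob(word, history):
    """Maximum-likelihood P(word | history) under a bigram model.

    Only history[-1] is used; everything earlier is ignored.
    """
    prev = history[-1]
    if context_counts[prev] == 0:
        return 0.0
    return bigram_counts[(prev, word)] / context_counts[prev]


p_teacher = bigram_prob("tea", ["the", "teacher", "drinks"])
p_squirrel = bigram_prob("tea", ["the", "squirrel", "drinks"])
print(p_teacher, p_squirrel)  # both are 2/3 on this toy corpus
```

Both calls return the same value because the model only ever sees "drinks". Note that the full-sentence probabilities would still differ, since P(squirrel | the) and P(teacher | the) are separate factors.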

So I wonder, is this a good assumption, or just something we assume because we have to assume something when the full sequence is not found in our corpus? Thanks.

arvyzukai · July 21, 2023, 7:08am

Hi @Joel_Wigton

You’re right in questioning the assumption made during lectures that the probability of the next element in a sequence only depends on a limited history of the sequence, like the previous element. In practice, this assumption is not ideal for NLP tasks because language often exhibits more complex dependencies beyond just the immediate context.
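As a rough illustration of what a longer history buys you, here is a small sketch that estimates the same probability with a bigram (n = 2) and a trigram (n = 3) model. The corpus and the function `ngram_prob` are hypothetical, written only for this post, and the function assumes n >= 2.

```python
from collections import Counter

# Toy corpus, made up for this example.
corpus = [
    "the teacher drinks tea",
    "the teacher drinks tea",
    "the teacher drinks coffee",
    "the squirrel drinks water",
]


def ngram_prob(word, history, n, sentences):
    """Maximum-likelihood P(word | last n-1 tokens of history), for n >= 2.

    Returns None when the context was never seen in the corpus.
    """
    context = tuple(history[-(n - 1):])
    ngram_counts = Counter()
    context_counts = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        for i in range(len(tokens) - n + 1):
            ctx = tuple(tokens[i:i + n - 1])
            ngram_counts[(ctx, tokens[i + n - 1])] += 1
            context_counts[ctx] += 1
    if context_counts[context] == 0:
        return None
    return ngram_counts[(context, word)] / context_counts[context]


teacher = ["the", "teacher", "drinks"]
squirrel = ["the", "squirrel", "drinks"]

for n in (2, 3):
    print(n, ngram_prob("tea", teacher, n, corpus),
          ngram_prob("tea", squirrel, n, corpus))
```

On this corpus the bigram model gives 0.5 for both histories, while the trigram model gives 2/3 after "teacher drinks" and 0.0 after "squirrel drinks". The trade-off is that each longer context is seen fewer times, so the estimates get noisier and unseen contexts become more common.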

While this kind of assumption may be suitable for simpler scenarios like chess or tic-tac-toe, where the current state is sufficient to make decisions (it doesn’t matter how you got into this state; all that matters is the current state), it falls short when dealing with the complexity of natural language.

In the past, such assumptions were used because they provided a straightforward way to model language and could achieve simple language processing goals. They are still useful to know if you have simple applications in mind and your priorities are speed and low cost. However, with advancements in NLP (which will be covered later in the course), we can now better capture the rich and intricate patterns present in natural language.

In conclusion, it’s essential to understand the historical context and simpler techniques, but for real-world NLP applications, more sophisticated tools are needed to handle the intricacies of language.

Cheers

Joel_Wigton · July 21, 2023, 7:57pm

Thanks for that context!
