I was trying the example of ‘’ after ‘cat’ with the function from q9 and I was getting a different probability than when doing it by hand. I realised that when previous_n_gram is a string, the line previous_n_gram = tuple(previous_n_gram) produces a tuple of its individual characters. So previous_n_gram must always be passed as a list.
In the example in the next cell you estimate the probability of ‘cat’ after ‘a’. Because ‘a’ is a single character, tuple() returns (‘a’,) anyway, so the problem does not show up there.
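Here is a quick check of what tuple() does in each case. This is only a small sketch of the built-in behaviour, not the notebook's code, and the variable names are made up for illustration:

```python
# tuple() iterates over its argument, so a string is split into characters,
# while a one-element list stays a single word.
from_string = tuple("cat")       # string passed directly
from_list = tuple(["cat"])       # string wrapped in a list
from_single_char = tuple("a")    # one-character string

print(from_string)       # ('c', 'a', 't')
print(from_list)         # ('cat',)
print(from_single_char)  # ('a',)
```

With a multi-character word passed as a plain string, the resulting key would not match any key built from whole words, which would explain a probability that differs from the hand calculation.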
So it would be better in the example to wrap ‘a’ in square brackets, like this:
tmp_prob = estimate_probability("cat", ["a"], unigram_counts, bigram_counts, len(unique_words), k=1)
Same in the following cell:
estimate_probabilities(["a"], unigram_counts, bigram_counts, unique_words, k=1)
Just saying