Hi, I’m stuck on Exercise 10 - calculate_perplexity (first sentence ['i', 'like', 'a', 'cat']). I wrote the function but can’t get the expected perplexity value, so I have a question: based on my table below, where am I going wrong? Is it the slicing or the probabilities? The probabilities were calculated with the estimate_probability function.
Thank you for any hints)
[Image: Screenshot 2024-02-22 232742]

Hi @Tyurey

Welcome to the community!

It’s hard to know for sure, but your p(“i” | <s>) is clearly wrong: it should be 0.22222. Calculation:

  • Count of (<s>, i) = 1, k=1, so the numerator is 1+1=2,
  • Count of <s> is 2, and the vocab size is 7, so the denominator is 2+7=9,
  • So the smoothed p(“i” | <s>) = 2 / 9 = 0.22222 (not 0.33333)
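
The same arithmetic as a quick Python check. This is only a sketch with the numbers above hard-coded; the variable names are made up for illustration and are not taken from the assignment code.

```python
# Add-k smoothing for p("i" | <s>), using the counts quoted above.
count_bigram = 1       # count of ('<s>', 'i')
count_previous = 2     # count of ('<s>',)
vocabulary_size = 7    # number of unique words
k = 1                  # smoothing constant

probability = (count_bigram + k) / (count_previous + k * vocabulary_size)
print(round(probability, 5))  # 0.22222
```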

Maybe your numerator for some reason is 3? Maybe something else. Have you passed previous tests?

Hi @arvyzukai, and thank you for the detailed response.
Yes, my probabilities are definitely wrong, and the ratio as well. But that’s strange, because I passed the estimate_probability test. In fact, this is the only exercise I haven’t finished. I’ll double-check my code and post my mistake here.


I have a similar question. I think I’m getting the correct probabilities, but my perplexity is off:

The actual perplexity for the first train sample is supposed to be: 2.8040

The instructions say to use back off when needed. However, I don’t think that’s the issue in this case because there are no missing bigrams in the dictionary.

Can anyone give me a hint?

Hi @cbusath

You got the probabilities right, but you’re missing the last one: your result is 162 ** (1/6), while what you need is 486 ** (1/6). The missing factor of 3 is 1 / p(<e> | cat), so your for loop range is probably one iteration short. Check the calculations:


  • sentences
    [['i', 'like', 'a', 'cat'], ['this', 'dog', 'is', 'like', 'a', 'cat']]
  • unique_words=7
    ['is', 'like', 'i', 'dog', 'a', 'cat', 'this']
  • unigram_counts
    {('<s>',): 2, ('i',): 1, ('like',): 2, ('a',): 2, ('cat',): 2, ('<e>',): 2, ('this',): 1, ('dog',): 1, ('is',): 1}
  • bigram_counts
    {('<s>', '<s>'): 2, ('<s>', 'i'): 1, ('i', 'like'): 1, ('like', 'a'): 2, ('a', 'cat'): 2, ('cat', '<e>'): 2, ('<s>', 'this'): 1, ('this', 'dog'): 1, ('dog', 'is'): 1, ('is', 'like'): 1}

The calculations for Exercise 10 - calculate_perplexity with [‘i’, ‘like’, ‘a’, ‘cat’] are:
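
A minimal Python sketch of that calculation, using only the counts listed in this post. The helper name `smoothed_probability` and the variable names are made up for illustration (this is not the assignment’s reference solution), and it assumes the sentence is padded with one `<s>` and one `<e>`, which matches the N = 6 in the exponent.

```python
# Counts copied from this post.
unigram_counts = {('<s>',): 2, ('i',): 1, ('like',): 2, ('a',): 2, ('cat',): 2,
                  ('<e>',): 2, ('this',): 1, ('dog',): 1, ('is',): 1}
bigram_counts = {('<s>', '<s>'): 2, ('<s>', 'i'): 1, ('i', 'like'): 1,
                 ('like', 'a'): 2, ('a', 'cat'): 2, ('cat', '<e>'): 2,
                 ('<s>', 'this'): 1, ('this', 'dog'): 1, ('dog', 'is'): 1,
                 ('is', 'like'): 1}
vocabulary_size = 7
k = 1


def smoothed_probability(word, previous_word):
    """Hypothetical helper: add-k smoothed p(word | previous_word)."""
    numerator = bigram_counts.get((previous_word, word), 0) + k
    denominator = unigram_counts.get((previous_word,), 0) + k * vocabulary_size
    return numerator / denominator


# Assumed padding: one start token and one end token.
tokens = ['<s>', 'i', 'like', 'a', 'cat', '<e>']
N = len(tokens)  # 6

product_of_inverses = 1.0
for t in range(1, N):  # t = 1..5, so the last pair ('cat', '<e>') is included
    product_of_inverses *= 1.0 / smoothed_probability(tokens[t], tokens[t - 1])

perplexity = product_of_inverses ** (1 / N)
print(round(product_of_inverses), f"{perplexity:.4f}")  # 486 2.8040
```

If the loop stopped at `range(1, N - 1)` instead, the final factor of 3 would be dropped and the product would be 162.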

For reference, here is how the first two probabilities were estimated:
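
Written out with the counts from this post, and assuming “the first two” means p(i | <s>) and p(like | i), the add-k estimates (k = 1, |V| = 7) would be as follows. The notation C(·) for counts is my own shorthand.

```latex
P(\text{i} \mid \langle s \rangle)
  = \frac{C(\langle s \rangle, \text{i}) + k}{C(\langle s \rangle) + k\,|V|}
  = \frac{1 + 1}{2 + 7} = \frac{2}{9} \approx 0.2222

P(\text{like} \mid \text{i})
  = \frac{C(\text{i}, \text{like}) + k}{C(\text{i}) + k\,|V|}
  = \frac{1 + 1}{1 + 7} = \frac{2}{8} = 0.25
```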


Got it! I’ll try this out. Thanks @arvyzukai! I’ve been staring at this one for like a week. :slight_smile:
