It seems like the following equation for the perplexity of a bigram model is computing per-sentence rather than per-word perplexity:
where m is the number of sentences in the test set W, |s_i| is the number of words in sentence i, and w_j^{(i)} is the j-th word of sentence i.
It would make sense to me if the -1/m exponent in the above equation were instead -1/N, where N = \sum_{i=1}^{m}{|s_i|}, i.e. the total number of words across all sentences.
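For reference, here is my reconstruction of what that per-word version would look like, built only from the definitions above (the lab's actual equation is shown as an image, so the exact form here is an assumption):

PP(W) = \left( \prod_{i=1}^{m} \prod_{j=1}^{|s_i|} P(w_j^{(i)} \mid w_{j-1}^{(i)}) \right)^{-1/N}, where N = \sum_{i=1}^{m}{|s_i|}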
If so, when I said:
it would mean:
i = m = 1 (the whole test set treated as a single sentence)
|s_i| = |s_1| = N
Which gives:
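Substituting m = 1 and |s_1| = N into the per-word form sketched above (and dropping the sentence index, since there is only one sentence):

PP(W) = \left( \prod_{j=1}^{N} P(w_j \mid w_{j-1}) \right)^{-1/N}

which is the usual per-word perplexity of the test set.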
(NOTE: the lab denotes both perplexity and probability as P(), which is super confusing.)
Anyway, your original query was about the equation above, which is correct and is derived from the cross-entropy of the model for the probability distribution P. I will not go into the proof, but will direct you to Section 3.3 of this link: https://web.stanford.edu/~jurafsky/slp3/3.pdf
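In code, per-word perplexity is just the exponentiated cross-entropy, i.e. exp of the average negative log-probability per word. A minimal sketch, assuming a hypothetical bigram_prob(prev, word) function that returns P(word | prev) and <s> / </s> sentence-boundary tokens:

```python
import math

def perplexity(sentences, bigram_prob):
    """Per-word perplexity of a bigram model over tokenized sentences.

    sentences: list of sentences, each a list of word strings.
    bigram_prob(prev, word): assumed to return P(word | prev).
    """
    log_prob_sum = 0.0
    n_words = 0
    for sentence in sentences:
        # Pad with assumed boundary tokens so every word has a predecessor.
        tokens = ["<s>"] + sentence + ["</s>"]
        for prev, word in zip(tokens, tokens[1:]):
            log_prob_sum += math.log(bigram_prob(prev, word))
            n_words += 1
    # Cross-entropy H = -(1/N) * sum of log-probabilities; perplexity = exp(H).
    return math.exp(-log_prob_sum / n_words)
```

Note that the summed log-probability is divided by the total word count N, not by the number of sentences m, which is exactly the per-word normalisation discussed above.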
I will also take this suggestion to the content creators of the slides and lab so they can add clarity around this equation.
Thanks!