Hi, the formula for perplexity in the Lab is not the same as the formula discussed in Course 2 of the specialization.
This is the formula that we have in the second course (the same as on Wikipedia):
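PP(W) = P(w_1, w_2, \ldots, w_N)^{-\frac{1}{N}} = \sqrt[N]{\frac{1}{P(w_1, w_2, \ldots, w_N)}}
where N is the total number of words in the test set W.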
The one in the Lab is:
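PP(W) = \sqrt[N]{\prod_{i=1}^{N} \frac{1}{P(w_i \mid w_{i-1})}}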
where we have the same number N
Hello @Ibrahim_RIDENE
Kindly share the Wikipedia link you are referring to, so a proper explanation can be provided after reviewing and comparing both the lab and Wikipedia versions as you stated.
Regards
DP
Hi @Ibrahim_RIDENE,
Both formulas are correct. The second one is derived from the first in the case of a bi-gram model, under the condition that all the sentences in the test set get concatenated.
This is talked about in the Language Model Evaluation chapter of Autocomplete, at timestamp 5:00.
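Concretely, once the sentences in the test set are concatenated into one sequence of N words, the bi-gram assumption gives
P(w_1, w_2, \ldots, w_N) \approx \prod_{i=1}^{N} P(w_i \mid w_{i-1})
so the first formula becomes
PP(W) = P(w_1, w_2, \ldots, w_N)^{-\frac{1}{N}} \approx \sqrt[N]{\prod_{i=1}^{N} \frac{1}{P(w_i \mid w_{i-1})}}
which is the second (lab) form.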
Hi,
Please find the link to the Wikipedia formula: Perplexity - Wikipedia.
@jyadav202, I would like to mention that even the formula you provided is not totally correct, for the following reason:
- The m under the root is not the same as the m of the outer product inside the root. The first must be the total number of words in the corpus, while the second is the number of sentences in the corpus.
It seems like the following equation for the perplexity of a bigram model is computing per-sentence instead of per-word perplexity:
PP(W) = \sqrt[m]{\prod_{i=1}^{m}\prod_{j=1}^{|s_i|} \frac{1}{P(w_j^{(i)}| w_{j-1}^{(i)})}}
where m is the number of sentences in the test set W, |s_i| is the number of words in sentence i, and w_j^{(i)} is the j-th word in sentence i.
It would make sense to me if the -1/m exponent in the above equation were actually -1/N, where
N = \sum_{i=1}^{m} |s_i|,
i.e. the total number of words across all sentences.
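Written out, the per-word version would read:
PP(W) = \sqrt[N]{\prod_{i=1}^{m}\prod_{j=1}^{|s_i|} \frac{1}{P(w_j^{(i)} \mid w_{j-1}^{(i)})}}, \qquad N = \sum_{i=1}^{m} |s_i|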
If so, when I said that all the sentences in the test set get concatenated, it would mean:
i = m = 1
|s_i| = |s_1| = N
Which gives:
PP(W) = \sqrt[N]{\prod_{i=1}^{N} \frac{1}{P(w_i| w_{i-1})}}
(NOTE: the lab denotes both perplexity and probability as P(), which is super confusing.)
Anyway, your original query was about the above equation, which is correct and is derived from the cross-entropy of the model for the probability distribution P. I will not go into the proof, but will direct you to section 3.3 of this chapter: https://web.stanford.edu/~jurafsky/slp3/3.pdf
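If it helps to see it in code, here is a minimal Python sketch of that per-word computation, done in log space. The bigram_prob mapping below is just a made-up stand-in for whatever estimate of P(w_i | w_{i-1}) the lab builds; the lab's actual estimator and smoothing are not reproduced here.

import math

def bigram_perplexity(words, bigram_prob):
    # words: a tokenized test sequence
    # bigram_prob: hypothetical mapping (prev_word, word) -> P(word | prev_word)
    tokens = ["<s>"] + words + ["</s>"]   # pad so the first word has a predecessor
    N = len(tokens) - 1                   # number of predicted words
    log_sum = sum(math.log(bigram_prob[(prev, cur)])
                  for prev, cur in zip(tokens, tokens[1:]))
    # exp(-(1/N) * sum of log P(w_i | w_{i-1})) is exactly the
    # N-th root of the product of 1 / P(w_i | w_{i-1}).
    return math.exp(-log_sum / N)

# Toy example with made-up probabilities:
probs = {("<s>", "i"): 0.5, ("i", "like"): 0.4,
         ("like", "nlp"): 0.2, ("nlp", "</s>"): 0.5}
print(bigram_perplexity(["i", "like", "nlp"], probs))   # ~2.66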
I will also pass the suggestion on to the content creators of the slides and the lab to add clarity to this equation.
Thanks!
Thanks for the clarification! Really appreciate it.