Perplexity Formula

The course materials define perplexity with M (the number of sentences) in the exponent, but the quiz seems to expect the per-word version with N (the total number of tokens). The per-word convention (dividing by N) is the one used in NLP textbooks such as Jurafsky & Martin's "Speech and Language Processing."
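To make the difference concrete, here is a minimal sketch of the two conventions on a made-up two-sentence corpus (the log-probabilities and lengths below are hypothetical, chosen only so the numbers come out cleanly):

```python
# Hypothetical toy corpus: base-2 log-probabilities and token counts
# for two sentences under some language model.
sentence_log_probs = [-12.0, -8.0]  # log2 P(sentence_i)
sentence_lengths = [5, 3]           # tokens per sentence

total_log_prob = sum(sentence_log_probs)   # -20.0
M = len(sentence_log_probs)                # number of sentences = 2
N = sum(sentence_lengths)                  # total number of tokens = 8

# Course-materials convention: average the exponent over M sentences.
ppl_per_sentence = 2 ** (-total_log_prob / M)

# Per-word convention (Jurafsky & Martin): average over N tokens.
ppl_per_word = 2 ** (-total_log_prob / N)

print(ppl_per_sentence)  # 2**10  = 1024.0
print(ppl_per_word)      # 2**2.5 ≈ 5.66
```

The same model assigns wildly different perplexity values depending on the divisor, which is why knowing which convention the quiz grader expects matters.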

Would it be possible to clarify which convention the quiz expects? Thank you so much for your time!