Hi, I have trouble understanding the concept of perplexity. Namely, how does it capture language model “quality”? Do you have an intuitive explanation for this? And why is it said to be the same as entropy? Doesn’t entropy just measure randomness?
Thanks!
Hi, @Maxim_Afteniy!
Perplexity measures how well a probability model predicts a sample. In NLP, it is a commonly used metric for model evaluation.
Say that you have a test set of well-written sentences. If your model is good enough, it will assign those samples a high probability (low perplexity), which means it is not surprised, or perplexed, to see them.
About the second question, perplexity can also be defined as 2 raised to the cross-entropy:
PP(W)=2^{H(W)}=2^{-\frac{1}{N}\log_2{P(w_1, w_2, ..., w_N)}}
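In case it helps, here is a minimal Python sketch of that formula, using made-up per-token probabilities (the values in `token_probs` are purely for illustration, not from a real model):

```python
import math

# Hypothetical probabilities a language model assigns to each token of a
# test sentence, i.e. P(w_i | w_1 ... w_{i-1}). These values are made up.
token_probs = [0.2, 0.1, 0.25, 0.05]
N = len(token_probs)

# Cross-entropy H(W) = -(1/N) * log2 P(w_1, ..., w_N)
#                    = -(1/N) * sum_i log2 P(w_i | context)
cross_entropy = -sum(math.log2(p) for p in token_probs) / N

# Perplexity PP(W) = 2^{H(W)}
perplexity = 2 ** cross_entropy

print(f"cross-entropy: {cross_entropy:.3f} bits, perplexity: {perplexity:.2f}")
```

A model that assigns higher probabilities to the test tokens would produce a lower cross-entropy and therefore a lower perplexity.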
For a deeper dive, check this article.