Why is the right answer the one with 1/4? The sentence length, including the special tokens, is 5.
Shouldn't the right answer be log PP(W) = (-1/5)(-113) rather than log PP(W) = (-1/4)(-113)?
Given the logarithms of these conditional probabilities:
log(P(Mary|<s>)) = -2
log(P(</s>|cats)) = -1
log(P(likes|Mary)) = -10
log(P(cats|likes)) = -100
Assuming our test set is W=“<s> Mary likes cats </s>”, what is the model’s perplexity?
log PP(W) = (-1/5)*(-113)
log PP(W) = (-1/4)*(-113)
log PP(W) = -113
log PP(W) = (-1/5)*113
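
For reference, here is a minimal sketch of the arithmetic behind the options. It assumes the two markers in the question are the usual sentence-start and sentence-end tokens, spelled `<s>` and `</s>` here, and it follows the common convention that N counts only the tokens the model predicts: the start marker is used as context but is never predicted itself, while the end marker is predicted. Under that assumption N is 4 even though the padded sentence has 5 tokens. The variable names are illustrative, not taken from the quiz.

```python
# Bigram log-probabilities from the question, keyed as (previous, current).
# "<s>" and "</s>" are assumed spellings of the start and end markers.
log_probs = {
    ("<s>", "Mary"): -2,
    ("Mary", "likes"): -10,
    ("likes", "cats"): -100,
    ("cats", "</s>"): -1,
}

tokens = ["<s>", "Mary", "likes", "cats", "</s>"]

# One term per predicted token: pair each token with the one before it.
total = sum(log_probs[(prev, cur)] for prev, cur in zip(tokens, tokens[1:]))

# "<s>" is only ever conditioned on, so it is left out of the count.
n_predicted = len(tokens) - 1

log_pp = (-1 / n_predicted) * total
print(total, n_predicted, log_pp)
```

There are four conditional probabilities in the list and four terms in the sum, so the normalizer matches the number of terms being averaged.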