I am having trouble calculating perplexity correctly. My results in the testing block differ from the expected values. I am getting the following results:
Perplexity for first train sample: 2.3348
Perplexity for test sample: 2.8040
Rather than the expected:
Perplexity for first train sample: 2.8040
Perplexity for test sample: 3.9654
I believe I am traversing the sentence properly. I calculate the probability with the `estimate_probability` function (all of its tests pass) and then use that probability as the denominator in `product_pi`, with 1 in the numerator. Do you have any suggestions?
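To illustrate the shape of the computation I mean (a toy example with made-up probabilities, not the template code):

```python
# Toy illustration of the accumulation step, not the assignment template.
probabilities = [0.25, 0.5, 0.125]    # made-up P(w_t | previous n-gram) values
product_pi = 1.0
for probability in probabilities:
    product_pi *= 1.0 / probability   # 1 in the numerator, P in the denominator
perplexity = product_pi ** (1.0 / len(probabilities))
print(perplexity)                     # 4.0: the geometric mean of 1/P
```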
Hmmm, well, they wrote most of the code for you in the template, so there aren't many places to go wrong, and what you described sounds right. The only tricky thing I can see in the instructions is the range values on the `for` loop. Are you sure you interpreted what they said about the upper end of the range correctly? The point is the "inclusive" part of that comment. If you say `range(1, 3)` in Python, that does not include the 3, right?
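For example:

```python
>>> list(range(1, 3))
[1, 2]        # the stop value 3 is excluded
>>> list(range(1, 3 + 1))
[1, 2, 3]     # to include 3, the stop value has to be one past it
```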
Thanks, that helped. As you said, most of the code has been provided. I took the comment too literally when it said to range from n to N-1: I used `N - 1` as the stop value of `range` rather than making sure the loop actually iterated through `t = N - 1`.
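For anyone else who hits this, here is a minimal sketch of the corrected function. The names, signatures, and k-smoothing here are my assumptions about what the template does, not the actual solution code:

```python
def estimate_probability(word, previous_n_gram, n_gram_counts,
                         n_plus1_gram_counts, vocabulary_size, k=1.0):
    """k-smoothed estimate of P(word | previous_n_gram) (assumed signature)."""
    previous_n_gram = tuple(previous_n_gram)
    previous_count = n_gram_counts.get(previous_n_gram, 0)
    n_plus1_count = n_plus1_gram_counts.get(previous_n_gram + (word,), 0)
    return (n_plus1_count + k) / (previous_count + k * vocabulary_size)

def calculate_perplexity(sentence, n_gram_counts, n_plus1_gram_counts,
                         vocabulary_size, k=1.0):
    n = len(next(iter(n_gram_counts)))             # n-gram order from the keys
    sentence = tuple(["<s>"] * n + list(sentence) + ["<e>"])
    N = len(sentence)

    product_pi = 1.0
    # My bug: `for t in range(n, N - 1)` stops at t = N - 2.
    # Correct: iterate t = n, ..., N - 1 inclusive, i.e. range(n, N).
    for t in range(n, N):
        probability = estimate_probability(
            sentence[t], sentence[t - n:t], n_gram_counts,
            n_plus1_gram_counts, vocabulary_size, k=k)
        product_pi *= 1.0 / probability            # 1 over P for each position

    return product_pi ** (1.0 / N)                 # N-th root of the product
```

With my old bound the last term was dropped from the product, and since every factor `1 / probability` is at least 1, that made my perplexities come out too low, which matches the numbers above.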