Hi,
I am working on this week’s assignment: autocomplete. Up to the final sections, it seems we haven’t really used perplexity, and we haven’t really used test_data either. I believe the perplexity of about 4 that we compute is for a toy test sentence.
However, we never measure perplexity on our test data. Am I right, or am I missing something here?
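For concreteness, here is a minimal sketch of what I would have expected, i.e. averaging perplexity over test_data with an add-k-smoothed bigram model. The function and variable names below are my own illustration, not the graded code:

```python
import math
from collections import Counter

def train_bigram_counts(sentences):
    """Count unigrams and bigrams, padding each sentence with <s> and <e>."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["<e>"]
        unigrams.update(padded)
        bigrams.update(zip(padded[:-1], padded[1:]))
    return unigrams, bigrams

def sentence_perplexity(tokens, unigrams, bigrams, vocab_size, k=1.0):
    """Perplexity of one sentence under an add-k-smoothed bigram model."""
    padded = ["<s>"] + tokens + ["<e>"]
    log_prob = 0.0
    for prev, word in zip(padded[:-1], padded[1:]):
        prob = (bigrams[(prev, word)] + k) / (unigrams[prev] + k * vocab_size)
        log_prob += math.log(prob)
    n = len(padded) - 1                # number of predicted tokens
    return math.exp(-log_prob / n)     # perplexity = exp(mean negative log-prob)

# Toy data standing in for the notebook's train_data / test_data:
train_data = [["i", "like", "a", "cat"], ["this", "dog", "is", "like", "a", "cat"]]
test_data = [["i", "like", "a", "dog"]]

unigrams, bigrams = train_bigram_counts(train_data)
vocab_size = len(unigrams)

# Average perplexity over the whole test set, not just one toy sentence:
avg_pp = sum(sentence_perplexity(s, unigrams, bigrams, vocab_size)
             for s in test_data) / len(test_data)
print(f"test-set perplexity: {avg_pp:.2f}")
```

The toy sentence in the notebook gives a number like 4, but an average over all of test_data is what I would call “testing on test data”.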
Hi @Fei_Li
Why would we have used it earlier? Perplexity is a metric for model evaluation (e.g., how well we trained our model).
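For reference, this is the standard definition used throughout the course: perplexity is the inverse probability of the text, normalized by the number of words. For a bigram model (with $w_0 = \langle s \rangle$, the start token):

$$
PP(W) = P(w_1, \ldots, w_N)^{-1/N} = \sqrt[N]{\prod_{i=1}^{N} \frac{1}{P(w_i \mid w_{i-1})}}
$$

so a lower perplexity means the model assigns the held-out text a higher probability.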
We do use it during training (unless you skipped it) in # UNQ_C4:
bare_eval_generator = ...
where we use eval_lines, which is declared earlier at the top of the notebook:
eval_lines = lines[-1000:] # Create a holdout validation set
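In other words, the holdout is just the tail slice of the corpus. A minimal sketch of the same pattern (the corpus contents here are stand-ins):

```python
# Sketch of the notebook's holdout pattern (corpus contents are stand-ins):
lines = [f"line {i}" for i in range(5000)]

train_lines = lines[:-1000]  # everything except the last 1000 lines
eval_lines = lines[-1000:]   # the holdout validation set, as in the notebook

print(len(train_lines), len(eval_lines))  # -> 4000 1000
```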
Yes, you are right that we do not measure perplexity on the test data (in # UNQ_C6 we compute it only on a single batch of training data).
Cheers
Hi Mentor @arvyzukai, I understand everything except this part: my UNQ_C4 is def count_words(tokenized_sentences).
I think I misplaced this thread; I will move it to C2W3. Did that cause the confusion? Sorry about that. Would you please take another look? Thank you very much.
Hi @Fei_Li
Yes, you are correct: the wrong topic category threw me off, and my response was not for C2W3.
You are also right about the assignment, and to be fair, it’s strange that I hadn’t noticed it previously.
The test sentence used when calculating perplexity is just a random sentence.
Cheers