When testing the fit_tokenizer function, I get:
Vocabulary contains 27284 words
token NOT included in vocabulary
Instead of the expected:
Vocabulary contains 27285 words
token included in vocabulary
Any ideas why this might be the case?
Set the oov_token when you create the Tokenizer. The OOV placeholder itself gets added to the word index, which accounts for the one extra word you're missing.
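A minimal sketch, assuming fit_tokenizer wraps Keras' Tokenizer from tensorflow.keras.preprocessing.text and that "<OOV>" is the placeholder string your exercise expects:

    from tensorflow.keras.preprocessing.text import Tokenizer

    def fit_tokenizer(sentences):
        # Passing oov_token adds a dedicated out-of-vocabulary entry to the
        # word index (at index 1), giving the expected vocabulary size.
        tokenizer = Tokenizer(oov_token="<OOV>")  # "<OOV>" is an assumed placeholder string
        tokenizer.fit_on_texts(sentences)
        return tokenizer

Without the oov_token argument, fit_on_texts only indexes words seen in the training sentences, so the vocabulary comes out one entry short and the OOV check fails.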
Great, that worked! Thanks!