C3W1_Assignment fit_tokenizer() 00V problem

Hi guys!
I have a problem with the 00V token detection. When I fit the tokenizer and convert the sentences to sequences, I get the correct number of words in the vocabulary, but I also get the message “<OOV> token NOT included in vocabulary”.

I initialized the tokenizer the same way I did during the class, but the 00V token still isn't included in the vocabulary.

Any tips?

My lab: ldzjcyddmhcu

Please click my name and send me your notebook as an attachment in a direct message.

Please find the feedback below:

  1. The out-of-vocabulary token to use is <OOV> and not <00V> (the characters are the upper-case letter O, not the digit zero).
  2. See the function fit_tokenizer to understand how the word-to-index mapping is accessed from the tokenizer instance in the test code. This will help you fix the mistakes in tokenize_labels.
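
To illustrate the first point, here is a minimal sketch in plain Python. It is not the assignment code and it does not use the Keras tokenizer: build_word_index is a hypothetical helper that only imitates a word index where the OOV token gets index 1 and the remaining words follow by frequency. It shows why the vocabulary size can look right while a membership check for "<OOV>" still fails.

```python
from collections import Counter


def build_word_index(sentences, oov_token):
    """Hypothetical stand-in for a tokenizer's word index.

    The OOV token gets index 1; the words follow, most frequent first.
    """
    counts = Counter(word for s in sentences for word in s.lower().split())
    word_index = {oov_token: 1}
    for word, _ in counts.most_common():
        word_index[word] = len(word_index) + 1
    return word_index


sentences = ["I love my dog", "I love my cat"]

wrong = build_word_index(sentences, oov_token="<00V>")  # digit zeros
right = build_word_index(sentences, oov_token="<OOV>")  # capital letter O

# Both vocabularies have the same size, so the word count looks correct...
print(len(wrong), len(right))  # 6 6

# ...but a check for the literal string "<OOV>" only passes for one of them.
print("<OOV>" in wrong)  # False
print("<OOV>" in right)  # True
```

With the real Keras Tokenizer the token is set through the oov_token argument, so the same one-character difference in that string would produce the symptom described in the question.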