# GRADED FUNCTION: tokenize_labels

The last function in the C3 W1 assignment assigns the OOV token to the first element of the vocab, even though I did not pass oov_token as an argument when initializing the Tokenizer. Any idea why?

Please click my name and message your notebook as an attachment.


In the function train_val_split, you should not assign a constant value like 1780 to train_size. Compute train_size from the function parameters: if training_split is 0.8, the training set should contain 80% of the rows and the validation set the remaining 20%.
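The advice above can be sketched as follows. This is a minimal illustration, assuming the notebook's train_val_split takes sentences, labels, and a training_split fraction and returns the four slices; the exact signature in the assignment may differ.

```python
def train_val_split(sentences, labels, training_split):
    """Split sentences/labels into train and validation sets by fraction."""
    # Derive the split index from the fraction, not a hard-coded number
    train_size = int(len(sentences) * training_split)

    train_sentences = sentences[:train_size]
    train_labels = labels[:train_size]
    validation_sentences = sentences[train_size:]
    validation_labels = labels[train_size:]

    return train_sentences, validation_sentences, train_labels, validation_labels
```

With 10 rows and training_split=0.8, this yields 8 training rows and 2 validation rows, whatever the dataset size is.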

No, sorry, I sent you the wrong file.
This is the correct notebook about OOV.

[code removed - moderator]

def tokenize_labels(labels) has a bug. There’s no need to use OOV when tokenizing labels.

But I didn’t pass any additional arguments in tokenize_labels, so why does the result come out like this?

You are calling fit_tokenizer to create a tokenizer for the labels. That function assigns an OOV token, and hence the problem.

I tried to delete it, but that didn’t seem to work.

You should create a tokenizer without oov. Please think about it. The dataset has labels for inputs. Why would you need an oov token for the labels?
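The point above can be sketched as follows. This is an illustrative version of tokenize_labels, not the assignment's reference solution: it builds a fresh Keras Tokenizer with no oov_token instead of reusing the notebook's fit_tokenizer helper, so index 1 goes to an actual label rather than to an OOV placeholder.

```python
from tensorflow.keras.preprocessing.text import Tokenizer


def tokenize_labels(labels):
    """Tokenize dataset labels without an OOV token.

    Every label that can appear is already in the training data,
    so there is nothing "out of vocabulary" to reserve an index for.
    """
    label_tokenizer = Tokenizer()  # note: no oov_token argument
    label_tokenizer.fit_on_texts(labels)
    label_sequences = label_tokenizer.texts_to_sequences(labels)
    return label_sequences, label_tokenizer.word_index
```

With labels like ["sport", "tech", "sport"], the word index maps "sport" to 1 and "tech" to 2, and no `<OOV>` entry appears, which is the behavior the grader expects when no oov_token is set.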

What happened? Let me see too.