# GRADED FUNCTION: tokenize_labels

The last function in the C3 W1 assignment assigns the OOV token to the first element of the vocab, even though I did not pass oov_token as an argument when initializing the Tokenizer. Any idea why?

Please click my name and message your notebook as an attachment.


In the function train_val_split, you should not assign a constant value like 1780 to train_size. Compute train_size from the function parameters: if training_split is 0.8, the training set should contain 80% of the rows and the validation set the remaining 20%.
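The advice above can be sketched as follows. This is a minimal illustration, assuming the notebook's train_val_split takes sentences, labels, and a training_split fraction and returns the four slices; the exact signature in the assignment may differ.

```python
def train_val_split(sentences, labels, training_split):
    """Split sentences/labels into train and validation sets by fraction."""
    # Derive the split index from the fraction, not a hard-coded number
    train_size = int(len(sentences) * training_split)

    train_sentences = sentences[:train_size]
    train_labels = labels[:train_size]
    validation_sentences = sentences[train_size:]
    validation_labels = labels[train_size:]

    return train_sentences, validation_sentences, train_labels, validation_labels
```

With 10 rows and training_split=0.8, this yields 8 training rows and 2 validation rows, whatever the dataset size is.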

No, sorry, I sent you the wrong file.
This is the correct notebook about OOV.

[code removed - moderator]

def tokenize_labels(labels) has a bug. There’s no need to use OOV when tokenizing labels.

But I didn’t pass any additional arguments in tokenize_labels, so why does the result come out like this?

You are calling fit_tokenizer to create a tokenizer for the labels. That function assigns an OOV token, and hence the problem.

I tried to delete it, but that didn’t seem to work.

You should create a tokenizer without oov. Please think about it. The dataset has labels for inputs. Why would you need an oov token for the labels?
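The point above can be sketched as follows. This is an illustrative version of tokenize_labels, not the assignment's reference solution: it builds a fresh Keras Tokenizer with no oov_token instead of reusing the notebook's fit_tokenizer helper, so index 1 goes to an actual label rather than to an OOV placeholder.

```python
from tensorflow.keras.preprocessing.text import Tokenizer


def tokenize_labels(labels):
    """Tokenize dataset labels without an OOV token.

    Every label that can appear is already in the training data,
    so there is nothing "out of vocabulary" to reserve an index for.
    """
    label_tokenizer = Tokenizer()  # note: no oov_token argument
    label_tokenizer.fit_on_texts(labels)
    label_sequences = label_tokenizer.texts_to_sequences(labels)
    return label_sequences, label_tokenizer.word_index
```

With labels like ["sport", "tech", "sport"], the word index maps "sport" to 1 and "tech" to 2, and no `<OOV>` entry appears, which is the behavior the grader expects when no oov_token is set.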

What happened? Let me see too.