I am trying to create the labels for the training and validation sets in the C3W2 assignment (BBC news archive). I fit the tokenizer on all the labels, then created label_seq as a sequence of split_labels. For some reason the first 5 labels of my training set come out properly, but the first 5 labels of my validation set show [list([]) list([]) list([]) list([]) list([])]. Also, the shape is wrong for my validation set: it shows (445,) instead of the correct shape. Any idea how to fix this?
Odds are good that you are invoking fit_on_sequences on the label tokenizer. Please use the correct method to fit the labels.
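For readers wondering how empty lists can appear at all: Keras's Tokenizer.texts_to_sequences drops any word that is missing from word_index when no oov_token is set. The sketch below imitates that behavior in plain Python. The helpers build_word_index and to_sequences are hypothetical stand-ins rather than Keras functions, and the label strings are only illustrative.

```python
from collections import Counter

def build_word_index(texts):
    """Stand-in for fit_on_texts: the most frequent word gets index 1."""
    counts = Counter(word for text in texts for word in text.lower().split())
    return {word: i for i, (word, _) in enumerate(counts.most_common(), start=1)}

def to_sequences(texts, word_index):
    """Stand-in for texts_to_sequences without an oov_token: unknown words are dropped."""
    return [[word_index[w] for w in text.lower().split() if w in word_index]
            for text in texts]

# Illustrative labels (assumed, not taken from the assignment data).
labels = ["sport", "business", "sport", "tech", "politics", "entertainment"]

fitted = build_word_index(labels)  # vocabulary built from the label strings
empty = {}                         # vocabulary that was never built from the label strings

good = to_sequences(labels[:3], fitted)
bad = to_sequences(labels[:3], empty)
print(good)  # one index per label
print(bad)   # one empty list per label
```

If the vocabulary was never built from the label strings, every label maps to an empty list, which is the symptom reported in the question.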
Hi, I ran into an issue with this part.
I fit the tokenizer on all the labels and created sequences from split_labels, but got the following results. I cannot figure out which part is wrong.
First 5 labels of the training set should look like this:
[[87] [22] [40] [40] [74]]
First 5 labels of the validation set should look like this:
[[25] [26] [23] [14] [14]]
Tokenized labels of the training set have shape: (1780, 1)
Tokenized labels of the validation set have shape: (445, 1)
I got the expected output in all previous parts. Could anyone give me a hint? Thanks a lot!
I tried both fit_on_texts() and fit_on_sequences(), with similar results.
Please click my name and message your notebook as an attachment in ipynb format.
In function tokenize_labels, you are not using label_tokenizer properly at fit_on_texts and texts_to_sequences (look for a reference to a global variable).
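The pitfall hinted at here can be sketched without Keras. In the code below, FakeTokenizer, tokenize_labels_buggy, and tokenize_labels_fixed are hypothetical illustrations, not the assignment's code. The assumption is that a tokenizer already fitted on article text sits in the notebook's global scope and the function calls it by accident instead of the local label_tokenizer.

```python
class FakeTokenizer:
    """Tiny stand-in for a Keras-style tokenizer (hypothetical, for illustration)."""

    def __init__(self):
        self.word_index = {}

    def fit_on_texts(self, texts):
        for text in texts:
            for word in text.lower().split():
                # Assign indices in order of first appearance, starting at 1.
                self.word_index.setdefault(word, len(self.word_index) + 1)

    def texts_to_sequences(self, texts):
        return [[self.word_index[w] for w in t.lower().split() if w in self.word_index]
                for t in texts]

# Global tokenizer fitted on article text, as earlier notebook cells might leave behind.
tokenizer = FakeTokenizer()
tokenizer.fit_on_texts(["the match was a great sport event",
                        "tech stocks lift business news"])

def tokenize_labels_buggy(all_labels, split_labels):
    label_tokenizer = FakeTokenizer()
    label_tokenizer.fit_on_texts(all_labels)
    # Bug: this calls the global `tokenizer`, so indices come from the article vocabulary.
    return tokenizer.texts_to_sequences(split_labels)

def tokenize_labels_fixed(all_labels, split_labels):
    label_tokenizer = FakeTokenizer()
    label_tokenizer.fit_on_texts(all_labels)
    # Uses the local tokenizer that was fitted on the labels only.
    return label_tokenizer.texts_to_sequences(split_labels)

all_labels = ["sport", "business", "tech"]
split = ["tech", "sport"]
buggy = tokenize_labels_buggy(all_labels, split)
fixed = tokenize_labels_fixed(all_labels, split)
print(buggy)
print(fixed)
```

In the buggy version the label indices are positions in the much larger article vocabulary, which would be consistent with unexpectedly large values such as 87 or 40 for a dataset with only a handful of categories.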
Thank you for your answer!
Do you mean the type of the labels should be a numpy array? I tried that, but got a similar result. Is there any other reason? Thanks
Got it, found the reason! Thanks a lot!
Hi. I am having a similar issue to the one Florawang mentioned above. All previous functions return the expected output, but tokenize_labels does not. I have tried a few different combinations, but so far nothing has helped.
First 5 labels of the training set should look like this:
[[87] [22] [40] [40] [74]]
Expected Output:
First 5 labels of the training set should look like this:
[[3] [1] [0] [0] [4]]
Not sure where to go next. Thanks.
Good afternoon. In the newer version of the C3W2 assignment, I got an additional “None” label among the labels when using StringLookup.
How do I remove that “None” label in position 0? What’s the reason for it? If someone could help me solve this, it would be great! Thank you.
Also, kindly create a new topic whenever you encounter an issue, even if you find a similar thread. A new topic gives you and other learners a better archive of the learning journey and avoids confusion for future learners seeking help.
Regards
DP