Hello!
My parse_data_from_file function does not return the expected result of 436 words for the first sentence.
My output looks like this:
There are 2225 sentences in the dataset.
First sentence has 737 words (after removing stopwords).
There are 2225 labels in the dataset.
The first 5 labels are ['tech', 'business', 'sport', 'sport', 'entertainment']
I have separately tested the remove_stopwords function, and it did give 436 words for the first sentence, as seen in output[2]. I have also tried parsing the second sentence, and it returned 1431 words after removing stopwords.
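For anyone who wants to reproduce the comparison, here is a minimal sketch of it. This is not my actual assignment code: the shortened stopword list, the two-column CSV layout with a header row, and the sample rows are all assumptions made up for illustration. It only shows the idea of checking the word count that comes out of the parser against the count from calling remove_stopwords directly on the same text.

```python
import csv
import io

# Hypothetical, shortened stopword list; the real assignment uses a longer one.
STOPWORDS = {"a", "the", "is", "in", "of", "and", "to"}

def remove_stopwords(sentence):
    """Lowercase the sentence and drop every word found in STOPWORDS."""
    words = sentence.lower().split()
    return " ".join(w for w in words if w not in STOPWORDS)

def parse_data_from_file(file_obj):
    """Read (label, text) rows, skip the header, and clean each text."""
    sentences, labels = [], []
    reader = csv.reader(file_obj, delimiter=",")
    next(reader)  # skip the header row (assumed to be present)
    for row in reader:
        labels.append(row[0])                       # assumed: label in column 0
        sentences.append(remove_stopwords(row[1]))  # assumed: text in column 1
    return sentences, labels

# Tiny in-memory stand-in for the dataset file (made-up rows).
sample = io.StringIO(
    "category,text\n"
    "tech,The future of TV is in the hands of viewers\n"
    "sport,A win in the final\n"
)
sentences, labels = parse_data_from_file(sample)

# Word count from the parser vs. from calling remove_stopwords directly.
direct = remove_stopwords("The future of TV is in the hands of viewers")
print(len(sentences[0].split()), len(direct.split()))
```

If the two printed numbers differ on the real data, the string reaching remove_stopwords inside the parser is not the same string that was tested by hand, which narrows down where to look.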
Thank you so much!