Hi,
Below are the preprocessing steps we have to perform:
- Eliminate handles and URLs
- Tokenize the string into words.
- Remove stop words such as “and”, “is”, “a”, “on”, etc.
- Stemming: convert every word to its stem. For example, dancer, dancing, and danced all become ‘danc’. You can use the Porter stemmer to take care of this.
- Convert all words to lower case.
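
To make the steps concrete, here is a minimal Python sketch of them chained together. The order shown is only one possibility and is an assumption on my part (it is exactly what I am asking about). The stop-word list is a tiny illustrative set, and `crude_stem` is a hypothetical toy stand-in for a real Porter stemmer, not the actual algorithm.

```python
import re

# Assumption: a tiny illustrative stop-word list, not a complete one.
STOP_WORDS = {"and", "is", "a", "on", "the"}


def crude_stem(word):
    """Toy suffix stripper standing in for a real Porter stemmer."""
    for suffix in ("ing", "ed", "er", "s"):
        # Only strip if at least three characters remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Apply the listed steps in one possible (assumed) order."""
    text = re.sub(r"https?://\S+", " ", text)  # eliminate URLs
    text = re.sub(r"@\w+", " ", text)          # eliminate handles
    text = text.lower()                        # convert to lower case
    tokens = re.findall(r"[a-z]+", text)       # tokenize into words
    tokens = [t for t in tokens if t not in STOP_WORDS]  # remove stop words
    return [crude_stem(t) for t in tokens]     # stem each remaining word


print(preprocess("@bob The dancer is dancing on http://example.com and danced"))
```

In this sketch, lowercasing happens before stop-word removal so that “The” matches the lowercase entry “the” in the list; whether that is the recommended order in general is part of my question.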
My question is: as a best practice, is there a particular order in which the above steps should be performed?
Thanks