Solution of the optional assignment is flawed

The word_index still has 106 stop words.

stp_keys = [s for s in word_index if s in stopwords] 
print(len(stp_keys))
# 106

This is caused by the stop word filtering step in the solution:

sentences = []
labels = []
with open("./bbc-text.csv", 'r') as csvfile:
    ### START CODE HERE
    reader = csv.reader(csvfile, delimiter=',')
    next(reader)
    for row in reader:
        labels.append(row[0])
        sentence = row[1]
        for word in stopwords:
            token = " " + word + " "
            sentence = sentence.replace(token, " ")
            sentence = sentence.replace("  ", " ")
        sentences.append(sentence)
    ### END CODE HERE

sentence.replace(token, " ") fails to remove both of two consecutive identical stop words, such as to to. This pattern becomes quite common once other stop words have already been removed. For example, in the sentence it was interesting to me to see kids ..., me is a stop word and is removed first, so the sentence becomes it was interesting to to see kids .... The two candidate matches of " to " share the space between them, and replace only substitutes non-overlapping matches, so it cannot remove both at once. The second to therefore remains in the sentence.

'it was interesting to to see kids ...'.replace(' to ', ' ')
# 'it was interesting to see kids ...'
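To make the overlap visible, here is a small check on the same example string. It is only an illustration: str.count scans for non-overlapping matches just like str.replace, while a zero-width lookahead reports every position where " to " starts.

```python
import re

text = "it was interesting to to see kids ..."

# str.count, like str.replace, only sees non-overlapping occurrences.
non_overlapping = text.count(" to ")

# A zero-width lookahead consumes no characters, so it also finds
# the second " to ", which shares a space with the first one.
overlapping = len(re.findall(r"(?= to )", text))

print(non_overlapping, overlapping)
# 1 2
```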

One solution is to use re.sub instead:

import re
re.sub(r'\bto\b', '', 'it was interesting to to see kids ...')
# 'it was interesting   see kids ...'
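The output above still has leftover spaces, and the loop would need one re.sub call per stop word. A minimal sketch of how the same idea could be applied to the whole list in one pass is below. The function name remove_stopwords and the short sample_stopwords list are made up for illustration and are not part of the assignment. It also assumes that only whitespace-delimited words should be removed, so it uses lookarounds instead of \b (with \b, the to in a hyphenated word like to-do would be stripped as well).

```python
import re

def remove_stopwords(sentence, stopwords):
    """Remove whitespace-delimited stop words and collapse extra spaces."""
    if not stopwords:
        return " ".join(sentence.split())
    # (?<!\S) and (?!\S) are zero-width: they require whitespace (or the
    # string boundary) around the word without consuming it, so adjacent
    # stop words such as "to to" can both match.
    alternatives = "|".join(re.escape(word) for word in stopwords)
    pattern = re.compile(r"(?<!\S)(?:" + alternatives + r")(?!\S)")
    stripped = pattern.sub(" ", sentence)
    # split() with no arguments drops runs of whitespace of any length.
    return " ".join(stripped.split())

# Hypothetical subset of the stop word list, just for this example.
sample_stopwords = ["it", "was", "to", "me"]
cleaned = remove_stopwords("it was interesting to me to see kids ...", sample_stopwords)
print(cleaned)
# interesting see kids ...
```

Because the lookarounds do not consume the surrounding spaces, this also catches stop words at the very start or end of a sentence, which the " " + word + " " token misses.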

That’s very true @Albert_Zhang

Well spotted!