NLP_C3W1_Practice Assignment - issue with build_vocabulary

I appear to be doing something wrong with build_vocabulary. My unit test passed, but the grader is giving me the following error:
Failed test case: vocab does not contain all words.
Expected:
9535,
but got:
9517.

Any recommendations? It is really simple: I loop through each of the strings in the corpus and then process each word in the tweet. If a word is not in the vocab dictionary, I add it using an incremented counter. Any thoughts on what I could try? Everything else is passing with full credit.

Hi @lisestdenis,

Everything is pretty straightforward, as you described. Set the index to the length of the existing vocabulary, then iterate over the corpus. For every word in the tweet check if it is not in the vocabulary. If so, add it to the vocabulary and increment the index. Please feel free to DM me your code if you still need help with this function.
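For reference, the steps described above can be sketched roughly as follows. This is only a minimal illustration, not the assignment's reference solution: the signature, the optional `vocab` argument, the placeholder tokens, and the assumption that each tweet is already a list of processed words are all hypothetical, so adapt them to whatever the notebook actually provides.

```python
def build_vocabulary(corpus, vocab=None):
    """Map each previously unseen word in `corpus` to the next free integer index."""
    # Assumption: `corpus` is an iterable of tweets that are already
    # tokenized into lists of words (the notebook may do this step itself).
    vocab = dict(vocab) if vocab is not None else {}

    # Start from the size of the existing vocabulary so new indices
    # never collide with entries that are already present.
    idx = len(vocab)
    for tweet in corpus:
        for word in tweet:
            if word not in vocab:
                vocab[word] = idx
                idx += 1
    return vocab


# Hypothetical toy data, including two placeholder entries that already exist.
corpus = [["good", "morning"], ["good", "night"]]
vocab = build_vocabulary(corpus, {"": 0, "[UNK]": 1})
print(vocab)  # {'': 0, '[UNK]': 1, 'good': 2, 'morning': 3, 'night': 4}
```

If the final size still differs from what the grader expects, it may be worth checking that every tweet goes through the same processing step before its words are added.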

Hi @lisestdenis

This issue might not be on your side; it was reported as a bug in March.

I had notified l.t. of the course, but it seems this still hasn’t been rectified.

But just to be sure, can you send me a screenshot of the code that failed the grader's test? Please click on my name and then Message.

Regards
DP

Hi @lisestdenis

I didn’t really understand what curr_index is for. If it was meant to map every word to an integer index, it is not required, since the previous step already did that. Also remember that indexing usually starts from 0, not 1. So kindly remove that line of code in both places.

For the line that handles a word not yet in the vocabulary, the index to assign would be the len function applied to vocab.
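To illustrate the indexing point only, here is a small sketch comparing a separate counter that starts at 1 with taking the length of the dictionary as the next index. The starting entries and the word list are made up for the example, and `curr_index` here just mirrors the name mentioned above; it is not the code from the notebook.

```python
# Hypothetical pre-existing entries; the real notebook may differ.
vocab = {"": 0, "[UNK]": 1}
words = ["good", "morning"]

# Variant 1: a separate counter that starts at 1.
# The first new word reuses index 1, which "[UNK]" already holds.
with_counter = dict(vocab)
curr_index = 1
for w in words:
    if w not in with_counter:
        with_counter[w] = curr_index
        curr_index += 1

# Variant 2: the current size of the dictionary is the next free index.
with_len = dict(vocab)
for w in words:
    if w not in with_len:
        with_len[w] = len(with_len)

print(with_counter)  # {'': 0, '[UNK]': 1, 'good': 1, 'morning': 2}
print(with_len)      # {'': 0, '[UNK]': 1, 'good': 2, 'morning': 3}
```

Both variants end up with the same number of keys, so this shows why the indices can go wrong rather than why the vocabulary size would differ.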

Let me know if the issue still persists.