C2W3 Assignment Exercise 7, 8, and 10

Hi, I am failing 3 tests and passing 9 for this exercise (preprocess_data). I tried passing vocabulary as one of the arguments when replacing OOV words. That matches the expected results in the first cell, but the testing cell fails 3 tests:

Wrong number of unknown tokens in the test_data_replaced list. Check the unknown token value and how you are using it.
Expected: 4
Got: 0.
Wrong number of unknown tokens in the train_data_replaced list. Check the unknown token value and how you are using it.
Expected: 10
Got: 0.
Wrong number of unknown tokens in the test_data_replaced list. Check the unknown token value and how you are using it.
Expected: 7
Got: 0.
9 Tests passed
3 Tests failed

I have no idea what is wrong. I also tried using unknown_token, but then even more tests fail.
Any help is appreciated, thank you!

Hi @roses_r_red

For the grader cell: the way you are using the unknown token value might be incorrect. Please check
UNQ_C6 GRADED_FUNCTION: replace_oov_words_by_unk

In that function there is a code comment:

Check if the token is in the closed vocabulary

Under it, the else branch needs to add the unknown token using append, as the next comment says:

otherwise, append the unknown token instead
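
To make the if/else step concrete, here is a minimal sketch of the general idea on toy data. It is not the assignment's reference solution: the function name replace_oov_sketch, the toy sentences, and the toy vocabulary are all made up for illustration, and the default "<unk>" is only an assumption about the token value.

```python
def replace_oov_sketch(tokenized_sentences, vocabulary, unknown_token="<unk>"):
    """Replace tokens outside `vocabulary` with `unknown_token` (illustrative only)."""
    vocabulary = set(vocabulary)  # a set makes membership checks fast
    replaced = []
    for sentence in tokenized_sentences:
        new_sentence = []
        for token in sentence:
            if token in vocabulary:
                # token is in the closed vocabulary: keep it
                new_sentence.append(token)
            else:
                # otherwise, append the unknown token instead
                new_sentence.append(unknown_token)
        replaced.append(new_sentence)
    return replaced


# Hypothetical toy data, not taken from the notebook
sentences = [["dogs", "run"], ["cats", "sleep"]]
vocab = ["dogs", "sleep"]
result = replace_oov_sketch(sentences, vocab)
print(result)  # [['dogs', '<unk>'], ['<unk>', 'sleep']]
```

Note that the value appended in the else branch is the unknown_token parameter, not a hard-coded string, so a caller can override it.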

Regards
DP

Hi Deepti,

I think I implemented that correctly. [link removed] is my Google Colab for the same notebook. I also suspect that I'm using the unknown value incorrectly. The code cells before Exercise 7 have been producing correct results and passing the tests, though, so I'm not entirely sure where the hiccup is. I'm backtracking to Exercises 4, 5, and 6 to see whether my counting or appending of unknown tokens is incorrect.

OK, kindly remove that link to your Google Colab. It is a graded assignment, and you are not supposed to share links to graded work in a public thread.

Please take a screenshot of the Exercise 7 graded cell code and DM me. Don't send the whole notebook. Click on my name and then Message.

Regards
DP

Hi @roses_r_red

I don't know if it was a genuine mistake or you didn't read the instructions in graded cell 7.

Check the statements below:

For the train data, replace less common words with “<unk>”

For the test data, replace less common words with “<unk>”

Both statements tell you to replace the less common words with “<unk>”, but you have not passed unknown_token=unknown_token in the call for either the train data or the test data.
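
One plausible reading of the "Got: 0" messages is that the tests call the function with a non-default unknown token, so a call that does not forward it silently falls back to the default. The sketch below illustrates that effect only. All function names, the simplified signatures, the toy data, and the custom token "<UNK>" are assumptions for illustration and do not reproduce the notebook's code or tests.

```python
def replace_oov(sentences, vocabulary, unknown_token="<unk>"):
    """Hypothetical stand-in for the OOV replacement helper."""
    vocabulary = set(vocabulary)
    return [[t if t in vocabulary else unknown_token for t in s] for s in sentences]


def preprocess_without_forwarding(train, test, vocabulary, unknown_token="<unk>"):
    # unknown_token is accepted but never passed on, so the helper's default is used
    return replace_oov(train, vocabulary), replace_oov(test, vocabulary)


def preprocess_with_forwarding(train, test, vocabulary, unknown_token="<unk>"):
    # unknown_token is forwarded to both the train and the test call
    return (
        replace_oov(train, vocabulary, unknown_token=unknown_token),
        replace_oov(test, vocabulary, unknown_token=unknown_token),
    )


def count_token(sentences, token):
    """Count how many times `token` appears across all sentences."""
    return sum(sentence.count(token) for sentence in sentences)


# Hypothetical toy data and a hypothetical non-default token
train = [["sky", "is", "blue"], ["leaves", "are", "green"]]
test = [["roses", "are", "red"]]
vocab = ["is", "are"]
custom = "<UNK>"

_, test_bad = preprocess_without_forwarding(train, test, vocab, unknown_token=custom)
_, test_good = preprocess_with_forwarding(train, test, vocab, unknown_token=custom)

print(count_token(test_bad, custom))   # 0
print(count_token(test_good, custom))  # 2
```

With the default token both versions behave the same, which would explain why earlier cells can pass while a test using a different token fails.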

Regards
DP

Hi,

I guess I was careless. I had forgotten to pass unknown_token for both the train and test data. Thank you for the hint!


Hi @roses_r_red

I would take it as an honest mistake. It can happen, and it has happened to me too!

Happy to help!
Keep learning!

Regards
DP
