C2W3 Assignment Exercise 7, 8, and 10

roses_r_red · May 9, 2024, 12:56am

Hi, I am failing 3 tests and passing 9 for Exercise 7, preprocess_data. I tried passing vocabulary as one of the arguments when replacing OOV words. That matches the expected results for the first cell, but the testing cell fails 3 tests.

Wrong number of unknown tokens in the test_data_replaced list. Check the unknown token value and how you are using it.
Expected: 4
Got: 0.
Wrong number of unknown tokens in the train_data_replaced list. Check the unknown token value and how you are using it.
Expected: 10
Got: 0.
Wrong number of unknown tokens in the test_data_replaced list. Check the unknown token value and how you are using it.
Expected: 7
Got: 0.
9 Tests passed
3 Tests failed

I have no idea what is wrong. I tried using unknown_token as well but that fails even more.
Any help is appreciated, thank you!

Deepti_Prasad · May 9, 2024, 4:53am

Hi @roses_r_red

For the grader cell, the way you are using the unknown token value might be incorrect. Check UNQ_C6 GRADED_FUNCTION: replace_oov_words_by_unk.

In that function there is this comment:

Check if the token is in the closed vocabulary

Under it, the else branch needs to add the unknown token using append:

otherwise, append the unknown token instead

Regards
DP
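
The if/else hint above can be sketched roughly as follows. This is only an illustration under assumptions, not the assignment's reference code: the function name `replace_oov_words_sketch`, its signature, and the toy data are hypothetical.

```python
def replace_oov_words_sketch(tokenized_sentences, vocabulary, unknown_token="<unk>"):
    """Replace every token that is not in the vocabulary with unknown_token."""
    closed_vocabulary = set(vocabulary)  # set gives fast membership checks
    replaced_sentences = []
    for sentence in tokenized_sentences:
        replaced_sentence = []
        for token in sentence:
            if token in closed_vocabulary:
                # Token is in the closed vocabulary: keep it
                replaced_sentence.append(token)
            else:
                # Otherwise, append the unknown token instead
                replaced_sentence.append(unknown_token)
        replaced_sentences.append(replaced_sentence)
    return replaced_sentences


# Toy data, made up for this sketch
sentences = [["dogs", "run"], ["cats", "sleep"]]
vocabulary = ["dogs", "sleep"]
result = replace_oov_words_sketch(sentences, vocabulary)
print(result)  # [['dogs', '<unk>'], ['<unk>', 'sleep']]
```

Note that the else branch appends the `unknown_token` parameter rather than a hard-coded string, so the function still works when a caller supplies a different token.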

roses_r_red · May 9, 2024, 4:57am

Hi Deepti,

I think I did that implementation correctly. Here is my Google Colab for the same notebook: [link removed]. I also think that maybe I’m using the unknown value incorrectly. The code cells before Exercise 7 have been producing correct results and passing the tests, though, so I’m not entirely sure where the hiccup may be. I’m backtracking to Exercises 4, 5, and 6 to see if my counting or appending of unknown tokens is incorrect.

Deepti_Prasad · May 9, 2024, 5:16am

OK, kindly remove that link to your Google Colab. It is a graded assignment, and you are not supposed to share a link to graded assignment code on a public thread.

Please take a screenshot of the Exercise 7 graded cell code and DM me. Don’t send the whole notebook. Click on my name and then Message.

Regards
DP

Deepti_Prasad · May 9, 2024, 5:26am

Hi @roses_r_red

I don’t know if it was a genuine mistake or you didn’t read the instructions in graded cell 7.

Check the statements below:

For the train data, replace less common words with “<unk>”

For the test data, replace less common words with “<unk>”

Both statements say to replace the less common words with “<unk>”, but you have not passed unknown_token=unknown_token in the call for either dataset.

Regards
DP
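
A rough sketch of why a missing unknown_token=unknown_token can produce "Got: 0". It assumes (this is not confirmed in the thread) that the tests call the function with a token different from the default; every function name and the token "<UNK>" here are hypothetical, and the replacement helper is a simplified stand-in, not the assignment code.

```python
def replace_oov(sentences, vocabulary, unknown_token="<unk>"):
    """Simplified stand-in: swap out-of-vocabulary tokens for unknown_token."""
    vocab = set(vocabulary)
    return [[tok if tok in vocab else unknown_token for tok in s] for s in sentences]


def preprocess_forgetting(sentences, vocabulary, unknown_token="<unk>"):
    # Bug: unknown_token is not forwarded, so the helper falls back to "<unk>"
    return replace_oov(sentences, vocabulary)


def preprocess_forwarding(sentences, vocabulary, unknown_token="<unk>"):
    # Fix: forward the caller's token explicitly
    return replace_oov(sentences, vocabulary, unknown_token=unknown_token)


def count_token(sentences, token):
    """Count how many times token appears across all sentences."""
    return sum(s.count(token) for s in sentences)


# Toy data and a custom token, both made up for this sketch
data = [["sky", "is", "blue"], ["leaves", "are", "green"]]
vocab = ["is", "are"]
custom = "<UNK>"

missing = count_token(preprocess_forgetting(data, vocab, custom), custom)
forwarded = count_token(preprocess_forwarding(data, vocab, custom), custom)
print(missing, forwarded)  # 0 4
```

With the default token both versions behave the same, which would explain why an earlier cell can match its expected output while tests that use a different token fail.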

roses_r_red · May 9, 2024, 5:37am

Hi,

I guess I was careless? I had forgotten to pass unknown_token for both the train and test data. Thank you for your hint!

Deepti_Prasad · May 9, 2024, 5:50am

Hi @roses_r_red

I would take it as an honest mistake. It can happen, and it has happened to me too!!!

happy to help!!!
Keep learning!!!

Regards
DP
