C1_W1: Error in frequency count part 2

Nguy_n_Nhi · July 26, 2023, 9:05am

Continuing the discussion from Error in frequency count:

I ran into issues with the frequency table generated by build_freqs. Comparing my results with someone else's results from Apr. 11, I see the same issues.

I see this has changed. I tested this function many times and followed the hints, but still didn’t get the right answer. I have not edited the utils.py file, and everything up to this point works well, but it does cause me issues later on.

gent.spah · July 26, 2023, 9:42am

Send me the code for extract_features in a private message; the error should be there. I will have a look at it, because otherwise I have no idea where the error might be.

gent.spah · July 26, 2023, 12:26pm

It is because you have not followed the instructions properly:

Implement the extract_features function.

This function takes in a single tweet.
Process the tweet using the imported process_tweet function and save the list of tweet words.
Loop through each word in the list of processed words.
- For each word, check the ‘freqs’ dictionary for the count when that word has a positive ‘1’ label. (Check for the key (word, 1.0).)
- Do the same for the count for when the word is associated with the negative label ‘0’. (Check for the key (word, 0.0).)

Note: In the implementation instructions provided above, the prediction of being positive or negative depends on a feature vector that counts duplicate words. This is different from what you have seen in the lecture videos.
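To make the note concrete, here is a minimal sketch of the difference between counting every occurrence of a word and counting each unique word once. The toy `freqs` values, the word list, and the helper `positive_sum` are invented for illustration and are not taken from the assignment.

```python
# Toy frequency dictionary; the counts are made up for illustration.
freqs = {("happi", 1.0): 10, ("happi", 0.0): 2, ("day", 1.0): 4}

# A processed tweet in which "happi" appears twice.
words = ["happi", "day", "happi"]


def positive_sum(tokens, freqs):
    """Sum the positive-label counts over the given tokens (hypothetical helper)."""
    return sum(freqs.get((w, 1.0), 0) for w in tokens)


# Looping over the list counts the duplicated word twice,
# which is what the instructions above describe.
with_duplicates = positive_sum(words, freqs)

# Looping over the unique words counts each word only once.
unique_only = positive_sum(set(words), freqs)

print(with_duplicates, unique_only)
```

The two sums differ whenever a word repeats in the tweet, so a feature vector built one way will not match an expected output built the other way.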

First, x is not 2-dimensional but 1-dimensional; it only gets the shape 1x3 once the batch dimension is added.

Second, you use (word, 1.0) as the key, with a floating-point label.

Third, the batch dimension is added at the end (already done for you) but you removed it.
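The three points can be sketched as follows. This is only an illustration under stated assumptions, not the graded solution: `simple_process_tweet` is a hypothetical stand-in for the course's process_tweet, `toy_freqs` contains made-up counts, and setting the first entry to 1 as a bias term is assumed from the assignment's output format.

```python
import numpy as np


def simple_process_tweet(tweet):
    """Hypothetical stand-in for process_tweet: lowercase and split on whitespace."""
    return tweet.lower().split()


def extract_features(tweet, freqs, process_tweet=simple_process_tweet):
    """Return a (1, 3) vector: [bias, sum of positive counts, sum of negative counts]."""
    word_l = process_tweet(tweet)

    # Point 1: start with a 1-dimensional array of three entries.
    x = np.zeros(3)
    x[0] = 1  # bias term (assumed)

    for word in word_l:
        # Point 2: the key pairs the word with a floating-point label.
        x[1] += freqs.get((word, 1.0), 0)
        x[2] += freqs.get((word, 0.0), 0)

    # Point 3: add the batch dimension last, turning shape (3,) into (1, 3).
    x = x[None, :]
    return x


# Made-up counts, for illustration only.
toy_freqs = {("good", 1.0): 3, ("good", 0.0): 1, ("bad", 0.0): 4}
features = extract_features("Good good bad", toy_freqs)
print(features.shape)
print(features)
```

Dropping the final `x[None, :]` line leaves the shape at (3,), which is the kind of mismatch the third point refers to.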

Nguy_n_Nhi · July 27, 2023, 1:12pm

Thank you so much, I passed it.
But I wrote the same code as in the images I sent you and no errors appear now; I don’t know why.

Qianchi_Zhou · August 6, 2023, 3:06pm

Hi. I’m having a similar problem with this part. After processing, the test tweet’s word list seems to be ['followfriday', 'top', 'engag', 'member', 'commun', 'week', ':)'], which has no negative words according to the freqs dict. So after the [23] test I got [[1.000e+00 3.133e+03 0.000e+00]], which differs from the expected output [[1.000e+00 3.133e+03 6.100e+01]].

I was using a tuple[str, float] key to look up the freqs dict, but I think that has nothing to do with the result, since there are no negative words in the tweet list, unless I made other mistakes.
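One way to test the assumption that no word in the list has a negative count is to print the positive and negative counts for each word. A minimal sketch, assuming a hypothetical helper `word_records` and an invented `toy_freqs`; in the toy data a token that looks positive still carries a nonzero negative count, which is the kind of case such a printout would reveal.

```python
def word_records(words, freqs):
    """Return (word, positive count, negative count) for each processed word."""
    return [
        (w, freqs.get((w, 1.0), 0), freqs.get((w, 0.0), 0))
        for w in words
    ]


# Invented counts, for illustration only.
toy_freqs = {(":)", 1.0): 5, (":)", 0.0): 2, ("top", 1.0): 3}
words = ["top", ":)"]

records = word_records(words, toy_freqs)
for word, pos, neg in records:
    print(f"{word!r}: positive={pos}, negative={neg}")

# Total that would end up in the negative slot of the feature vector.
neg_total = sum(neg for _, _, neg in records)

# In Python, 1 and 1.0 compare equal and hash equal, so (w, 1) and (w, 1.0)
# address the same dictionary entry.
assert toy_freqs[(":)", 1)] == toy_freqs[(":)", 1.0)]
```

If the negative total from such a printout is zero while the expected output is not, the lookup keys or the freqs dict being passed in would be the next things to inspect.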

Thanks for the help.

Qianchi_Zhou · August 6, 2023, 3:10pm

I printed out all the word records for debugging:
