C1_W2_Assignment_Exercise 2 - train_naive_bayes

vemula_suman · February 22, 2024, 3:47pm

Hi,
I am getting the error below, and I cannot figure out what mistake I made.

Thank you

paulinpaloalto · February 22, 2024, 5:24pm

I added several print statements to instrument my train_naive_bayes function and here’s what I see when I run that test cell:

type(wordlist) <class 'list'>
V = 9165, len(wordlist) 11436
V: 9165, V_pos: 5804, V_neg: 5632, D: 8000, D_pos: 4000, D_neg: 4000, N_pos: 27547, N_neg: 27152
freq_pos for smile = 47
freq_neg for smile = 9
loglikelihood for smile = 1.5577981920239676
0.0
9165

The most common error on this function is simply to add one to N_pos and N_neg for each occurrence. The intent here is that you add the actual frequency each time by looking it up in the freqs dictionary that was passed in as an argument.

Of course there are other moving parts here, e.g. check the order of operations on your computation of p_w_pos and p_w_neg.
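
To illustrate the order-of-operations point, here is a minimal sketch using made-up toy numbers (not the assignment data). The names freq_pos, N_pos and V follow the print statements above. The toy values and the variable names wrong and right are only for illustration.

```python
# Toy values, chosen only for illustration (not from the assignment data).
freq_pos = 3   # times the word appears in positive tweets
N_pos = 10     # total word occurrences in positive tweets
V = 5          # number of unique words in the vocabulary

# Without parentheses, division binds tighter than addition,
# so this computes freq_pos + (1 / N_pos) + V, which is not a probability.
wrong = freq_pos + 1 / N_pos + V

# With parentheses, numerator and denominator are grouped as the
# Laplace-smoothed formula intends: (freq + 1) / (N + V).
right = (freq_pos + 1) / (N_pos + V)

print("without parentheses:", wrong)
print("with parentheses:", right)
```

A quick sanity check is that the smoothed value should always land strictly between 0 and 1.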

vemula_suman · February 22, 2024, 6:08pm

Dear Sir,

I am getting the same values, but I still get the same error.

paulinpaloalto · February 22, 2024, 6:30pm

Yes, your values do all agree with mine. It’s probably time to look at your code. We aren’t supposed to do that in public on this thread, but we can do it by DM. I will send you a DM about how to do that.

anfagudelogo · April 6, 2024, 4:32pm

Hello! Could you explain the concept of N_pos and N_neg to me?

paulinpaloalto · April 6, 2024, 4:47pm

We have a dictionary called freqs where the keys are (word, sentiment) where sentiment is either 1 for “positive” or 0 for “negative”. What that dictionary tells us is the total number of times a given word appears in a positive tweet and appears in a negative tweet.

In order to implement the Naive Bayes formula we need to know the total number of word occurrences in positive tweets and in negative tweets. That is what the numbers N_pos and N_neg represent: the sum of the frequencies of all words seen in positive tweets, and the same sum for negative tweets.

To compute those numbers we can loop over the keys in the freqs dictionary. Note that some words appear in both positive and negative tweets, but not all words do.
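
Here is a minimal sketch of that loop on a tiny made-up freqs dictionary. The words and counts are invented for illustration; only the (word, sentiment) key layout comes from the description above. It also shows the "add one per key" mistake mentioned earlier in this thread for comparison.

```python
# Toy dictionary with the same key layout as freqs: (word, sentiment) -> count.
# The words and counts are invented for illustration.
freqs = {
    ("happy", 1): 3,
    ("happy", 0): 1,
    ("sad", 0): 4,
    ("smile", 1): 4,
}

N_pos = 0
N_neg = 0
for (word, sentiment), count in freqs.items():
    if sentiment == 1:
        N_pos += count   # add the actual frequency
    else:
        N_neg += count

# The common mistake: adding 1 per key counts distinct words,
# not total occurrences.
wrong_N_pos = sum(1 for (_, sentiment) in freqs if sentiment == 1)
wrong_N_neg = sum(1 for (_, sentiment) in freqs if sentiment == 0)

# V is the number of unique words across both classes.
V = len({word for (word, _) in freqs})

print(f"N_pos: {N_pos}, N_neg: {N_neg}, V: {V}")
print(f"wrong N_pos: {wrong_N_pos}, wrong N_neg: {wrong_N_neg}")
```

Note that "happy" appears under both sentiments but is counted only once in V.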

Eugene_2024 · April 19, 2024, 4:50pm

I have the same problem: all my values are the same, but I got this error:

Wrong values for loglikelihood dictionary. Please check your implementation for the loglikelihood dictionary.
Wrong values for loglikelihood dictionary. Please check your implementation for the loglikelihood dictionary.
Wrong values for loglikelihood dictionary. Please check your implementation for the loglikelihood dictionary.
12 Tests passed
3 Tests failed

paulinpaloalto · April 19, 2024, 4:55pm

Please add similar print statements to those I added earlier on this thread and show the results you get. Are you sure everything is the same?

Something must be different if you are failing the tests. The tests have been in use for years by now, so we would know if there were problems with the tests.

The most common error here is just adding one for each occurrence instead of using the actual frequencies from the freqs dictionary.

Eugene_2024 · April 19, 2024, 5:36pm

V: 9165, V_pos: 5804, V_neg: 5632, D: 8000, D_pos: 4000.0, D_neg: 4000.0, N_pos: 27547, N_neg: 27152
freq_pos for smile = 47
freq_neg for smile = 9
loglikelihood for smile = 1.6529230243738393

Yes, I see a difference in the loglikelihood. I used just np.log(p_w_pos/p_w_neg) and got this value. I calculated the example manually and got your result. OK, thank you. I can see the way forward, but I don't understand how this happened.

Eugene_2024 · April 19, 2024, 6:13pm

How did you get this value?

Eugene_2024 · April 19, 2024, 6:18pm

p_w_pos = (47+1)/(27547+5804)
p_w_neg = (9+1)/(27152+5632)
np.log(p_w_pos/p_w_neg)

My result is 1.5514687524908015
Could you point out where the mistake might be?

Update: I found it.

paulinpaloalto · April 19, 2024, 7:49pm

Great! Yes, you have not translated the math formula correctly. Glad to hear you found the issue.
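
For later readers: the thread does not say what the fix was. One reading that is consistent with the numbers posted above (an assumption, not something confirmed in the thread) is that the Laplace-smoothed denominator should use the full vocabulary size V rather than the per-class counts V_pos and V_neg. A small sketch using only the values printed earlier in this thread:

```python
import math

# Values copied from the print output earlier in this thread.
V, V_pos, V_neg = 9165, 5804, 5632
N_pos, N_neg = 27547, 27152
freq_pos, freq_neg = 47, 9   # counts for the word "smile"

# Smoothing with the full vocabulary size V in both denominators.
p_w_pos = (freq_pos + 1) / (N_pos + V)
p_w_neg = (freq_neg + 1) / (N_neg + V)
with_V = math.log(p_w_pos / p_w_neg)

# Smoothing with the per-class counts, as in the manual calculation above.
p_w_pos_alt = (freq_pos + 1) / (N_pos + V_pos)
p_w_neg_alt = (freq_neg + 1) / (N_neg + V_neg)
with_class_V = math.log(p_w_pos_alt / p_w_neg_alt)

print("using V:", with_V)                      # approximately 1.5578
print("using V_pos / V_neg:", with_class_V)    # approximately 1.5515
```

The first value agrees with the loglikelihood for smile in the reference output, and the second agrees with the 1.5514... result from the manual calculation.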

Mariam_Ali2 · April 20, 2024, 6:19pm

Hi, I passed all the test cells in the programming assignment, as shown below, but when I submit the notebook I get 0 on two of the tests, even though the test cell for that function shows that I passed. How can I solve this?

paulinpaloalto · April 20, 2024, 6:37pm

That is the next assignment, not Naive Bayes. But that syndrome usually means you have made the mistake of referencing the global variable word_embeddings in your get_country function.

Mariam_Ali2 · April 20, 2024, 6:52pm

This is the function. I have rewritten it more than once and the same error is displayed.
[Images removed by moderator as it contains grader cell codes and posting codes is against community guidelines]

paulinpaloalto · April 20, 2024, 7:28pm

Yes, you are making exactly the mistake that I described.

Mariam_Ali2 · April 20, 2024, 7:30pm

Where is the mistake?

paulinpaloalto · April 20, 2024, 7:35pm

I explained it earlier: it is a mistake to reference word_embeddings directly. It works in the notebook, but the grader passes a different dictionary. It is passed to you as an argument to the function, so the correct way is to use the argument value which is embeddings, right?
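
A generic sketch of that pitfall, which is not the assignment code: the functions lookup_wrong and lookup_right and the dictionary contents are hypothetical, and only the names word_embeddings and embeddings come from the discussion above.

```python
# A notebook-level global, standing in for word_embeddings.
# The contents are made up for illustration.
word_embeddings = {"a": 1.0}

def lookup_wrong(word, embeddings):
    # Ignores the argument and reads the global instead.
    # This works in the notebook only because the global happens to exist.
    return word_embeddings[word]

def lookup_right(word, embeddings):
    # Uses the dictionary that the caller actually passed in.
    return embeddings[word]

# A grader can pass a different dictionary than the notebook global.
grader_embeddings = {"a": 2.0}

print(lookup_wrong("a", grader_embeddings))   # reads the global's value
print(lookup_right("a", grader_embeddings))   # reads the passed-in value
```

In the notebook both versions return the same thing when called with word_embeddings, which is why the local tests pass and only the grader exposes the difference.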

Mariam_Ali2 · April 20, 2024, 8:15pm

Right, I noticed it now. Thank you.
