It was a stupid indentation issue. Sorry! I am now getting this output. Looks all good!
freq_pos for smile = 47
freq_neg for smile = 9
loglikelihood for smile = 1.5577981920239676
0.0
9165
Yes, those values agree with mine. Hope the grader also agrees!
Wrong number of keys in loglikelihood dictionary.
Expected: 9165.
Got: 11436
Hey Paul, can you help me? I can't seem to find my mistake. I got the wrong value for loglikelihood.
this is my code:
{moderator edit - solution code removed}
The problem is the way you have defined the vocab. You can't just take the set of freqs, because those are tuples and they are all unique. That's why you end up with too many entries. You need the unique words, which are the first entry in each of the tuples.
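A minimal sketch of the difference, assuming freqs maps (word, label) tuples to counts as described above. The toy dictionary is made up for illustration and is not the assignment data:

```python
# Toy freqs dictionary (hypothetical data): keys are (word, label) tuples.
freqs = {
    ("happy", 1.0): 3,
    ("happy", 0.0): 1,
    ("sad", 0.0): 2,
    ("smile", 1.0): 4,
}

# Taking the set of the keys keeps every tuple, so a word that appears
# with both labels is counted twice.
too_many = set(freqs)

# The vocabulary is the set of unique words: the first entry of each tuple.
vocab = {word for word, _label in freqs}

print(len(too_many))  # 4
print(len(vocab))     # 3
```

The same effect at full scale would explain a grader message that reports more keys than expected.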
Also note that you've got "order of operations" issues in your p_w_pos and p_w_neg calculations. Try the following and watch what happens:
m = 5.
x = 1./1. + m
y = 1./(1. + m)
If you're expecting x and y to have the same value, you're in for a nasty surprise.
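For anyone who wants to see the surprise without running it, here is the same experiment with the results printed. Nothing here is specific to the assignment; it is plain Python operator precedence:

```python
m = 5.0

# Division binds tighter than addition, so this is (1 / 1) + m.
x = 1.0 / 1.0 + m

# Parentheses force the sum to be computed first: 1 / (1 + m).
y = 1.0 / (1.0 + m)

print(x)  # 6.0
print(y)  # 0.16666666666666666
```

Any numerator or denominator that is itself a sum needs its own parentheses.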
Oh, sorry, I didn't read every line: you've also made the really classic mistake which most of the posts earlier on this thread are about. You just add 1 for each entry in the freqs dictionary, instead of using the actual frequency.
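To make the "add 1 versus actual frequency" point concrete, here is a small sketch on made-up data. The dictionary and the names n_pos_wrong, n_pos and n_neg are hypothetical and only illustrate the two ways of counting:

```python
# Toy freqs dictionary (hypothetical data): (word, label) -> count.
freqs = {
    ("happy", 1.0): 3,
    ("happy", 0.0): 1,
    ("sad", 0.0): 2,
    ("smile", 1.0): 4,
}

# Mistake: adding 1 per entry counts distinct (word, label) pairs.
n_pos_wrong = sum(1 for (_word, label) in freqs if label > 0)

# Intended: add the stored frequency, giving the total number of
# word occurrences in each class.
n_pos = sum(count for (_word, label), count in freqs.items() if label > 0)
n_neg = sum(count for (_word, label), count in freqs.items() if label == 0)

print(n_pos_wrong)  # 2
print(n_pos)        # 7
print(n_neg)        # 3
```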
Thanks, I have corrected my vocab, but it is still showing 3 tests failed. I have tried for pair in freqs.keys():
{moderator edit - solution code removed}
but it is still not working. What is the problem in this code? It gets the total frequency of positive and negative words, but 3 tests still fail. Why?
If the loop is an enumeration of value, I don't see value used in the body of the loop. How is pair defined with your code written that way?
With the enumeration over pair in freqs.keys(), that loop body should work, but note that there are lots of other ways to go off the rails later in the code. Did you fix the order of operations thing I pointed out?
I have solved the problem. It was in freq_pos and freq_neg: I was not counting the actual frequency of each word. Thank you so much for the time and effort, Paul.
Thank you. It helps me. After reviewing equations (4) and (5), I see that you are right. The numerator uses the frequency count, so the denominator should use the total frequency.
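As a sanity check, the numbers quoted in this thread can be plugged in by hand. This sketch assumes equations (4) and (5) are the usual add-one smoothed estimates, (freq + 1) / (N + V). That form is my reading rather than something quoted here, but it does reproduce the loglikelihood value for "smile" that several posters report:

```python
import math

# Values quoted in this thread for the full training set.
V = 9165                      # number of unique words in the vocabulary
N_pos, N_neg = 27547, 27152   # total word frequencies per class
freq_pos, freq_neg = 47, 9    # counts for the word "smile"

# Assumed add-one (Laplace) smoothed probabilities.
p_w_pos = (freq_pos + 1) / (N_pos + V)
p_w_neg = (freq_neg + 1) / (N_neg + V)

loglikelihood_smile = math.log(p_w_pos / p_w_neg)
print(loglikelihood_smile)  # approximately 1.5578
```

If your own printout for "smile" differs, comparing V, N_pos and N_neg against these values one at a time narrows down which quantity is off.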
Hi Paul,
I have added these print statements and got the same output as yours, but it's still failing 3 test cases.
if word == "smile":
    print(V, D, D_pos, D_neg, N_pos, N_neg)
    print(f"freq_pos for smile = {freq_pos}")
    print(f"freq_neg for smile = {freq_neg}")
    print(f"loglikelihood for smile = {loglikelihood[word]}")
output :
9165 8000 4000.0 4000.0 27547 27152
freq_pos for smile = 47
freq_neg for smile = 9
loglikelihood for smile = 1.5577981920239676
0.0
9165
9165 8000 4000.0 4000.0 27547 27152
freq_pos for smile = 47
freq_neg for smile = 9
loglikelihood for smile = 1.5577981920239676
Wrong values for loglikelihood dictionary. Please check your implementation for the loglikelihood dictionary.
9165 20 10.0 10.0 27547 27152
freq_pos for smile = 47
freq_neg for smile = 9
loglikelihood for smile = 1.5577981920239676
Wrong values for loglikelihood dictionary. Please check your implementation for the loglikelihood dictionary.
9165 15 10.0 5.0 27547 27152
freq_pos for smile = 47
freq_neg for smile = 9
loglikelihood for smile = 1.5577981920239676
Wrong values for loglikelihood dictionary. Please check your implementation for the loglikelihood dictionary.
12 Tests passed
3 Tests failed
Not sure what I am missing. Thanks for all your posts.
Hi Tithi_Sarkar,
Did you manage to resolve this issue? If not, feel free to send me your notebook as an attachment to a direct message so I can have a look at what is going on.
Hey, sorry to pile on, but I have a similar issue. I did the checks that Paul wrote and get the correct values for freq_pos and freq_neg for smile, as well as the correct values for N_pos and N_neg. However, my loglikelihood is still incorrect, and both my numerator and denominator use parentheses, so I don't think there is an order of operations issue in p_w_pos and p_w_neg. Do you have any recommendations for other places to look?
Hi Elisa_Vera,
I find it hard to say without looking at your code. Feel free to send me your notebook as an attachment to a direct message. I can then have a look.
Hi Elisa_Vera,
Look carefully at the comment in train_naive_Bayes that states the following:
# calculate V, the number of unique words in the vocabulary
Can you see where the problem lies?
Thanks very much for watching this thread Reinoud! Sorry for my lack of response, but I was traveling the last 2 weeks and had a hard time keeping up on the forums.
*facepalm
Thank you!
No facepalm needed; we all have bugs in our code. You are welcome.
No problem Paul. I hope you had a good time traveling.
LOL, well, @Vincent_Rupp, when I read your reply to @paulinpaloalto I skipped his explanation and tried to understand the sentence "N_pos isn't the total positive words; it's the total frequency for all positive words." and got confused. So I started looking back and forth through the quiz text and suddenly caught the error, and now @paulinpaloalto's explanation seems very easy to understand. As always, once you spot the error, you can't understand how you overlooked it. As explained by @paulinpaloalto, the error is adding 1 instead of the number of times the word occurs in the tweets. Thank you both, because I spent around 4 hours today debugging and couldn't find any error.
Thanks @paulinpaloalto, I was making the same mistake.
I seem to be missing something. I am getting V, which is len(vocab), as 9161 instead of 9165. I do not know where I made the mistake. Thanks in advance.