I passed 12 tests but failed 3, as shown in the attached screenshot. I checked my implementation but have no idea how to change the loglikelihood dictionary. Does anyone have any ideas?
I'd have to see your implementation of exercise 2 to find the problem with it; send it to me in a private message!
Here are some problems:
Increment the number of positive words by the count for this (word, label) pair, and do the same for the negative words. Increment by the count, not by 1!
Here, when you calculate D_pos, the number of positive documents: the comments tell you to use train_y!
In the last for loop, use the lookup function directly; there is no need for assignments to variables or if conditions. Just follow the comments provided exactly!
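As a toy sketch of the pattern being suggested (the names `freqs` and `lookup` here are only illustrative, not the assignment's solution): a lookup built on `dict.get` with a default of 0 returns a zero count for missing keys, so no if condition is needed.

```python
# Toy sketch with illustrative names -- NOT the assignment's solution.
# A lookup over a {(word, label): count} dictionary can wrap dict.get
# with a default of 0, so missing pairs never raise a KeyError.

def lookup(freqs, word, label):
    """Return the count for (word, label), or 0 if the pair is absent."""
    return freqs.get((word, label), 0)

freqs = {("happy", 1): 3, ("sad", 0): 2}
print(lookup(freqs, "happy", 1))  # 3
print(lookup(freqs, "happy", 0))  # 0 -- no KeyError, no if needed
```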
@gent.spah Thanks for the advice. I have revised the code (attached). 12 tests passed, 3 tests failed (attached). If I take out the if condition in the last for loop now, the program fails with a KeyError. I can take it out later, once I know where my code is wrong.
You are not allowed to post solutions publicly! I am telling you: just use the lookup function directly as per the comments above; there is no need for ifs there. Try reading the comments again from scratch!
I have read the comments again and revised the code. 12 tests passed, 3 tests failed. I have no idea how to change the code further.
It's probably time to look at your code. We can't do that on a public thread, but I just sent you a private message (DM) about how to proceed.
To close the loop on the public thread, there was a simple typo in one of the expressions that was causing the problems. Should be all sorted now!
Can you send a screenshot of your code via personal DM so I can review where you might have gone wrong? @xujinge
Hi @xujinge
Check your DM
Your code for calculating the number of unique words, N_pos, N_neg, V_pos, V_neg, the number of documents, and the positive and negative documents needs to use the correct arguments. The log prior was also computed incorrectly: the calculation of the probabilities for positive and negative documents needs to be checked.
The prior probability represents the underlying probability in the target population that a tweet is positive versus negative. In other words, if we had no specific information and blindly picked a tweet out of the population set, what is the probability that it will be positive versus that it will be negative? That is the "prior".
So for the log prior calculation, take np.log of the ratio of positive documents to the total number of documents, minus np.log of the ratio of negative documents to the total number of documents.
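As a toy illustration with made-up labels (the names D, D_pos, and D_neg follow the notation above; this is a sketch, not the assignment's code), the log prior computation looks like:

```python
import numpy as np

# Hypothetical labels, purely for illustration: 3 positive, 2 negative.
train_y = np.array([1, 1, 1, 0, 0])

D = train_y.shape[0]               # total number of documents
D_pos = int(np.sum(train_y == 1))  # number of positive documents
D_neg = D - D_pos                  # number of negative documents

# log prior = log(D_pos / D) - log(D_neg / D)
logprior = np.log(D_pos / D) - np.log(D_neg / D)
print(logprior)  # equals log(3/2) for these toy labels
```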
Regards
DP
Hello, please assist. I am getting 9161 as the length of loglikelihood, instead of 9165.
{moderator edit  solution code removed}
Please follow the community guidelines. It also seems the notebook assignment has a kernel issue, so I would first advise getting a fresh copy and redoing the assignment. Then share only a screenshot of the code for the graded cell you are having an issue with, by personal DM.
Use the edit option on your comment to remove the code images here @Ernest_Divine
I somehow missed seeing your DM. In case you are still stuck:
Corrections required:

While calculating V, you are using incorrect code to recall the vocabulary from the dictionary; you need to use freq.keys()

To calculate the number of documents, the instructions say:
"Using the train_y input list of labels, calculate the number of documents (tweets) D, as well as the number of positive documents (tweets) D_pos and the number of negative documents (tweets) D_neg."
So use train_y.shape[0] rather than the len function for your labels.
Now, to find the positive and negative documents using the function call above: labels for positive documents are 1 and for negative documents 0, so what condition would you use to select each of them?

The log prior would then be np.log of positive documents over the number of documents, minus np.log of negative documents over the number of documents.

To get the positive and negative frequency of each word, use freq.get here rather than lookup (so the autograder does not fail your submission).
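A toy sketch of that step with made-up counts (the variable names are illustrative and this is not the assignment's solution; `dict.get` with a default of 0 handles words missing from one class):

```python
import numpy as np

# Toy frequency dictionary: {(word, label): count}, with label 1 = positive.
freqs = {("great", 1): 4, ("great", 0): 1, ("awful", 0): 3}

V = len({word for (word, _label) in freqs})  # unique words in the vocabulary
N_pos = sum(c for (_w, lab), c in freqs.items() if lab == 1)
N_neg = sum(c for (_w, lab), c in freqs.items() if lab == 0)

word = "great"
freq_pos = freqs.get((word, 1), 0)  # .get defaults to 0 for missing pairs
freq_neg = freqs.get((word, 0), 0)

# Laplace-smoothed conditional probabilities and the word's log likelihood
p_w_pos = (freq_pos + 1) / (N_pos + V)
p_w_neg = (freq_neg + 1) / (N_neg + V)
loglikelihood = np.log(p_w_pos) - np.log(p_w_neg)
```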
Let me know if you still have doubts.
Regards
DP
Thank you @Deepti_Prasad
It's been resolved.