Hello @NhomNhom
Review of your notebook
-
get_word_frequency
You computed the number of emails from the wrong parameter. Remember that the array of emails is X, not Y (Y holds the label corresponding to each email).
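To illustrate the point about X and Y, here is a minimal sketch of how get_word_frequency could be laid out. The signature, the label values, and the way counts are initialised are assumptions for illustration; your notebook's version may differ (for example, it may start the counts at a non-zero value).

```python
def get_word_frequency(X, Y):
    """Count, for each word, how many emails of each class contain it.

    Assumption: X is a list of tokenized emails (lists of words) and
    Y is a parallel list of labels, one per email.
    """
    word_dict = {}
    num_emails = len(X)  # the emails live in X; Y only holds their labels
    for email_index in range(num_emails):
        email = X[email_index]
        cls = Y[email_index]
        for word in set(email):  # count each word at most once per email
            if word not in word_dict:
                word_dict[word] = {}
            word_dict[word][cls] = word_dict[word].get(cls, 0) + 1
    return word_dict
```

The important part is that the loop runs over the emails in X and only reads the matching label from Y.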
The prob_word_given_class code is correct, so no changes are required there.
-
For prob_email_given_class, the code below is incorrect:
Update the prob by multiplying it with P(word | class).
Don’t forget to add the word_frequency and class_frequency parameters!
prob *= word_frequency[word][cls]/class_frequency[cls]
Here you are supposed to call the function defined in the previous graded cell, i.e. use prob_word_given_class and pass it the word and the class. Also note the second statement in the instruction, which says to add the word_frequency and class_frequency parameters.
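As a rough sketch of what the prob_email_given_class update could look like when it reuses prob_word_given_class: the parameter order, the name treated_email, and the choice to skip words missing from word_frequency are assumptions, so check them against the signatures in your own notebook.

```python
def prob_word_given_class(word, cls, word_frequency, class_frequency):
    # Same ratio as the line quoted above: emails of class `cls` that
    # contain `word`, divided by the number of emails of class `cls`.
    return word_frequency[word][cls] / class_frequency[cls]


def prob_email_given_class(treated_email, cls, word_frequency, class_frequency):
    prob = 1.0
    for word in treated_email:
        # Assumption: words never seen in training are skipped.
        if word in word_frequency:
            # Reuse the earlier function instead of repeating the formula.
            prob *= prob_word_given_class(
                word, cls, word_frequency, class_frequency
            )
    return prob
```

The result is numerically the same as the inline formula, but the exercise asks for the call to the previously defined function.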
In the Naive Bayes function:
Compute P(ham) using the class_frequency dictionary and using the formula #ham emails / #total emails
You used the P(spam) calculation when computing P(ham). Check the numerator: it should be the number of ham emails.
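A small sketch of the two priors side by side, assuming class_frequency is a dictionary keyed by "spam" and "ham" (the helper name class_priors and the example counts are made up for illustration):

```python
def class_priors(class_frequency):
    # Total number of emails across both classes.
    total = class_frequency["spam"] + class_frequency["ham"]
    p_spam = class_frequency["spam"] / total  # spam emails / total emails
    p_ham = class_frequency["ham"] / total    # ham emails / total emails
    return p_spam, p_ham


# Hypothetical counts, only to show the two numerators differ.
p_spam, p_ham = class_priors({"spam": 1, "ham": 3})
```

Only the numerator changes between the two lines; the denominator is the total number of emails in both cases.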
Please make sure you run all the cells from beginning to end before you submit the assignment.
Let me know if your issue is resolved.
Regards
DP