Smoothed probability for Naive Bayes

Anivader · November 4, 2022, 8:15pm

Hello,

Can someone offer an explanation (preferably a proof) as to why the second term in the denominator should be “V_class”, which is the number of unique words in the vocabulary?

Here’s the equation:
P(w_i | class) = (freq(w_i, class) + 1) / (N_class + V_class)

Any insight would be helpful.

Thanks
Ani

Anivader · November 4, 2022, 8:22pm

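Okay, I think I got it. You are essentially adding 1 to the count of every word in the probability table, so summing over all the rows adds an extra V_class to the total. Putting N_class + V_class in the denominator is what keeps the smoothed probabilities summing to 1.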
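
To spell the argument out: summing the numerators freq(w_i, class) + 1 over all V_class words gives N_class + V_class, which is exactly the denominator, so the column of probabilities adds up to 1. Here is a minimal Python sketch that checks this numerically. The function name `smoothed_probs` and the toy word lists are made up for illustration, and it assumes V_class is simply the size of the vocabulary list passed in.

```python
from collections import Counter


def smoothed_probs(tokens, vocab):
    """Add-one smoothed P(w | class) for every word in vocab (illustrative helper)."""
    freq = Counter(tokens)                  # freq(w_i, class); missing words count as 0
    n_class = sum(freq[w] for w in vocab)   # N_class: total word count in this class
    v = len(vocab)                          # assumed to be V_class from the equation
    return {w: (freq[w] + 1) / (n_class + v) for w in vocab}


# Hypothetical toy data, not taken from the course.
vocab = ["happy", "sad", "because", "learning"]
positive_tokens = ["happy", "because", "learning", "happy"]

probs = smoothed_probs(positive_tokens, vocab)
total = sum(probs.values())

for word, p in probs.items():
    print(word, p)
print("sum over vocabulary:", total)
```

Note that "sad" never appears in the class, yet it still gets a non-zero probability, which is the point of the smoothing. If the denominator were just N_class, the total would come out larger than 1.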
