In the Naive Bayes Assumptions video, in order to explain how NB may produce undesired values because features are assumed independent, the instructor gave a fill-in-the-blank prediction example:
" It’s always cold and snowy in ______ "
He mentioned that NB can assign equal probabilities to possible answers such as spring/summer/fall/winter.
How does this deduction make sense? Can anyone explain (if possible with an example)?
I personally think NB can accurately classify even such fill-in-the-blank cases.
Hi @Deependra_Singh1
Let me try to explain my interpretation of the example:
The classic example of independence is flips of a fair coin (when you know for certain that the coin is fair): if you flipped heads now, that tells you nothing about the next flip.
Naive Bayes makes this same independence assumption about words in language (which in reality is absolutely not the case) so that the mathematical formulas work out, and sometimes the approach works well even though the assumption is wrong.
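Written out (just a sketch of the standard Naive Bayes factorization, not the course's exact notation), the assumption is that the per-word likelihoods simply multiply:

$$
P(\text{season} \mid w_1, \dots, w_n) \;\propto\; P(\text{season}) \prod_{i=1}^{n} P(w_i \mid \text{season})
$$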
So under this naive (word-independence) assumption, "cold" and "snowy" are treated as independent and contribute to the prediction equally, when in reality "snowy" alone should be almost enough to predict "winter", and "cold" should contribute less once "snowy" is present, because the two words are correlated. In reality, the word "not" would also change things dramatically under a word-by-word (unigram) approach (e.g. "cold and not snowy"), but NB would just sum up the log likelihoods.
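Here is a minimal sketch in Python of what that summing looks like. The log likelihoods below are completely made up just to illustrate the mechanics; real values would come from word counts in the training corpus:

```python
import math

# Made-up log P(word | season) values, for illustration only;
# in a trained model these come from corpus counts.
log_likelihood = {
    "cold":  {"winter": math.log(0.40), "fall": math.log(0.20),
              "spring": math.log(0.15), "summer": math.log(0.05)},
    "snowy": {"winter": math.log(0.50), "fall": math.log(0.05),
              "spring": math.log(0.03), "summer": math.log(0.01)},
    "not":   {"winter": math.log(0.10), "fall": math.log(0.10),
              "spring": math.log(0.10), "summer": math.log(0.10)},
}

def nb_score(words, season):
    # Naive Bayes just sums per-word log likelihoods,
    # treating each word as independent evidence.
    return sum(log_likelihood[w][season] for w in words if w in log_likelihood)

for season in ["winter", "fall", "spring", "summer"]:
    print(season, round(nb_score(["cold", "snowy"], season), 2))

# "cold and not snowy" gets almost the same ranking: "not" just adds one
# more (uninformative) term to the sum, it cannot negate "snowy".
for season in ["winter", "fall", "spring", "summer"]:
    print(season, round(nb_score(["cold", "not", "snowy"], season), 2))
```

Note how the model has no way to let "not" flip the meaning of "snowy", and how "cold" and "snowy" each add their full contribution even though they carry largely overlapping evidence.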
So my understanding is that, depending on the corpus, NB could assign equal probabilities to spring/summer/fall/winter (because of the math), but in practice it would most probably classify "winter" correctly without trouble.