In the Naive Bayes Assumptions video, in order to explain how NB may produce undesired values because features are assumed independent, the instructor gave a fill-in-the-blank prediction example:
" It’s always cold and snowy in ______ "
He mentioned that NB can assign equal probabilities to possible answers such as spring/summer/fall/winter.
How does this deduction make sense? Can anyone explain (if possible with an example)?
I personally think NB can accurately classify even such fill-in-the-blank cases.
Hi @Deependra_Singh1
Let me try to explain my interpretation of the example:
The classic example of independence is flips of a fair coin (when you know for certain that the coin is fair): if you flipped heads now, that tells you nothing about the next flip.
Naive Bayes makes this same independence assumption about words in language (which in reality is absolutely not the case) so that the mathematical formulas work out, and sometimes the approach works well even though the assumption is wrong.
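Written out (just a sketch of the standard Naive Bayes factorization, not the course's exact notation), the assumption is that the per-word likelihoods simply multiply:

$$
P(\text{season} \mid w_1, \dots, w_n) \;\propto\; P(\text{season}) \prod_{i=1}^{n} P(w_i \mid \text{season})
$$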
So under this naive (word-independence) assumption, "cold" and "snowy" are treated as independent and contribute to the prediction equally, when in reality "snowy" alone should be almost enough to predict "winter", and "cold" should contribute less once "snowy" is present, because the two words are correlated. In reality, the word "not" would also change things dramatically under a word-by-word (unigram) approach (e.g. "cold and not snowy"), but NB would just sum up the log likelihoods.
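Here is a minimal sketch in Python of what that summing looks like. The log likelihoods below are completely made up just to illustrate the mechanics; real values would come from word counts in the training corpus:

```python
import math

# Made-up log P(word | season) values, for illustration only;
# in a trained model these come from corpus counts.
log_likelihood = {
    "cold":  {"winter": math.log(0.40), "fall": math.log(0.20),
              "spring": math.log(0.15), "summer": math.log(0.05)},
    "snowy": {"winter": math.log(0.50), "fall": math.log(0.05),
              "spring": math.log(0.03), "summer": math.log(0.01)},
    "not":   {"winter": math.log(0.10), "fall": math.log(0.10),
              "spring": math.log(0.10), "summer": math.log(0.10)},
}

def nb_score(words, season):
    # Naive Bayes just sums per-word log likelihoods,
    # treating each word as independent evidence.
    return sum(log_likelihood[w][season] for w in words if w in log_likelihood)

for season in ["winter", "fall", "spring", "summer"]:
    print(season, round(nb_score(["cold", "snowy"], season), 2))

# "cold and not snowy" gets almost the same ranking: "not" just adds one
# more (uninformative) term to the sum, it cannot negate "snowy".
for season in ["winter", "fall", "spring", "summer"]:
    print(season, round(nb_score(["cold", "not", "snowy"], season), 2))
```

Note how the model has no way to let "not" flip the meaning of "snowy", and how "cold" and "snowy" each add their full contribution even though they carry largely overlapping evidence.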
So my understanding is that, depending on the corpus, NB could assign equal probabilities to spring/summer/fall/winter (because of the math), but in practice it would most probably classify "winter" correctly without trouble.