C1_W2, why would Naive Bayes perform poorly for the given dataset distribution


In the ungraded lab, it was said that Naive Bayes would not perform well for the given dataset distribution.
Here is what I think the reason is:
In Naive Bayes we are essentially learning the probability distribution of the inputs with respect to each class, and then we choose the class with the higher posterior probability.
In this case the two classes have an overlapping region (within about 2*sigma of both class means) that contains a lot of points. Inside that region, if the positive class has a somewhat higher probability, say in the range (0.3-0.4), while the negative class has a significant but lower probability, say (0.25-0.3), then we will always end up predicting positive there, and every true negative point in that region is misclassified (and symmetrically for regions where the negative class dominates). Since that region holds many points, the accuracy drops. If there were no overlap, or if only a few points fell inside it, such errors would rarely arise.

Is my reasoning correct, or is there another reason for it?

Hi @God_of_Calamity

My understanding is that the image you posted indicates that the classes are not linearly separable (at least not effectively), and we would need to transform the features (one feature or the other, or both) for the accuracy to go up.
Compare the predictions:

  • the image you posted:
    image
  • and the original predictions:
    image

Note that in this case we predict negative when log_positive - log_negative < 0 (above the blue line) and positive when log_positive - log_negative > 0 (below the blue line).
But the model in your picture is correct only in the bottom-left region, where all the true positive cases (green dots) fall below the blue line, while the big majority of the data points would be predicted negative.
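That decision rule can be sketched in a few lines. The log-likelihood numbers below are made up purely to illustrate the sign test; they are not from the lab:

```python
import numpy as np

# Hypothetical per-tweet log-likelihoods under each class model
# (invented values, just to demonstrate the rule described above).
log_positive = np.array([-12.3, -8.1, -15.0])
log_negative = np.array([-11.0, -9.4, -15.0])

# Predict positive when log_positive - log_negative > 0,
# negative otherwise (ties fall to negative here).
score = log_positive - log_negative
prediction = np.where(score > 0, "positive", "negative")
print(list(zip(score.round(1), prediction)))
```

Geometrically, score = 0 is the blue line in the plots: points above it (score < 0) get labeled negative, points below it positive.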

I’m not sure why the course creators chose to artificially modify the data this way, but I guess it was just for visualization purposes: to illustrate (no pun intended :slight_smile: ) that when you graph the data you can “see” the problems it could possibly have.

Cheers
