Have I correctly understood the Naïve Bayes inference formula?

The lecture titled ‘Log Likelihood, Part 1’ mentions the following:
[image: the formula from the lecture, with the leading ratio marked in a red rectangular box]
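(In case the image does not load: transcribing the slide from memory, the formula in question should be the Naïve Bayes inference rule below, where $m$ is the number of words in the tweet, and the red box is around the leading ratio.)

$$\frac{P(pos)}{P(neg)} \prod_{i=1}^{m} \frac{P(w_i \mid pos)}{P(w_i \mid neg)} > 1$$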
As I understand it, this part of the formula (the part in the red box) adjusts for the effect of the total amount of positive and negative vocabulary only to a certain extent, rather than eliminating it entirely. To eliminate that effect completely, one would need to raise the term to the power of m.
[image: my proposed modification of the formula]
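(Again in case the image does not load: what I am proposing would look something like the leading ratio raised to the power of $m$.)

$$\left(\frac{P(pos)}{P(neg)}\right)^{m} \prod_{i=1}^{m} \frac{P(w_i \mid pos)}{P(w_i \mid neg)}$$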

Which course are you referring to? You posted in “AI Discussions”, which isn’t about any specific course.

I have moved it to the relevant category.

Natural Language Processing with Classification and Vector Spaces

Ok, thanks!

Hi @Micheal_Anderson,

Would you like to have a discussion with someone who is not from this course?

I have a different understanding of this formula: the ratio you put in the red rectangular box is not something you can optionally eliminate. In fact, it is the very thing that makes it Bayes. In other words, whenever you don’t see that ratio in a Bayesian context, it is only because it happens to equal 1.
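For example, with a perfectly balanced training set,

$$\frac{P(pos)}{P(neg)} = \frac{0.5}{0.5} = 1,$$

so the ratio silently disappears from the product, but it is still part of the formula.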

What this lecture seems to be looking for is the probability that the sentiment is positive given the words, which is then compared with the probability that the sentiment is negative given the same set of words.

So you are literally just constructing these two probabilities. Below is an example that uses both Bayes’ rule and the independence assumption, which is the one thing that makes it “Naive” (together, you have the Naive Bayes :wink: ).
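(A sketch of that example, in case the image does not render: first apply Bayes’ rule, then the independence assumption.)

$$P(pos \mid w_1, \dots, w_m) = \frac{P(w_1, \dots, w_m \mid pos)\, P(pos)}{P(w_1, \dots, w_m)} \approx \frac{P(pos) \prod_{i=1}^{m} P(w_i \mid pos)}{P(w_1, \dots, w_m)}$$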

You have this formula for the case of sentiment = pos, and you get another formula in the same way for the case of sentiment = neg. Divide the positive case by the negative case, and you get the formula that says: if the ratio is larger than 1, the probability that the sentiment is positive given the words is the larger one.
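Concretely, the evidence term $P(w_1, \dots, w_m)$ cancels in the division, leaving

$$\frac{P(pos \mid w_1, \dots, w_m)}{P(neg \mid w_1, \dots, w_m)} = \frac{P(pos)}{P(neg)} \prod_{i=1}^{m} \frac{P(w_i \mid pos)}{P(w_i \mid neg)},$$

and you predict positive whenever this ratio is larger than 1. Note how the leading ratio survives the division: it is exactly the term in your red rectangular box.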

Cheers,
Raymond


@Micheal_Anderson, rather than asking how to eliminate it, what do you think the term in your red rectangular box contributes, given that it is indeed the “Prior” term of Bayes’ rule?

Thank you very much! Now I realize that I missed this crucial parameter in the initial step of the Bayes calculation process.


You are welcome, @Micheal_Anderson!

You said:

> …this part of the formula adjusts for the effect of the total amount of positive and negative vocabulary to a certain extent, rather than eliminating it entirely.
If you are interested, this is actually a good chance to review that statement, now that we know the part comes from the “Prior” term of Bayes’ rule. For example, we know the prior should have nothing to do with the observational data that we put into the “likelihood” term, because that is what the name “prior” means: “prior to the observational data”. While your choice of the word “adjusts” is still a very nice capture of what the prior can do, more can be elaborated from there. :wink:
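If it helps, here is a minimal sketch with made-up numbers (none of them come from the course or its labs) showing how the prior can tip a decision even when the words alone lean the other way:

```python
from math import prod

# Hypothetical class counts in a training set -> the "prior" ratio.
# These numbers are invented purely for illustration.
n_pos, n_neg = 70, 30
prior_ratio = n_pos / n_neg                 # P(pos)/P(neg) ~= 2.33

# Hypothetical per-word ratios P(w_i|pos)/P(w_i|neg) for one tweet:
word_ratios = [0.8, 0.9, 1.1]

likelihood_ratio = prod(word_ratios)        # ~= 0.79: the words alone lean negative
score = prior_ratio * likelihood_ratio      # ~= 1.85: the prior tips it to positive

print("positive" if score > 1 else "negative")  # -> positive
```

Notice that the words only enter through the likelihood term; the prior was fixed before this tweet was ever observed.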

But this is totally optional. I’m glad to know that you have found something in my last reply.

Cheers,
Raymond
