C3_W1 Naive Bayes Algorithm Question

I am a bit confused and wonder if there is a mistake in the bullet point here (C3_W1, section 2.2). It states that:

P(categorical x_k | C_i ) = \frac{\text{total number of samples in } X \text{ that have } x_k}{\text{ number of samples in } C_i}

But it is possible that the number of samples in X that have x_k is greater than the number of samples in C_i, which would give P > 1.

Is the correct division not:

\frac{\text{ number of samples in } C_i \text{ that have } x_k }{\text{ total number of samples in } C_i}
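To make the difference concrete, here is a minimal sketch on a made-up toy dataset. The data, the feature values, and the function names `as_written` and `conditional` are assumptions for illustration only and are not taken from the course notebook. It compares the formula as stated in the bullet point with the version that counts only within C_i:

```python
# Toy data: (value of the categorical feature x_k, class label).
# Entirely made up for illustration.
samples = [
    ("sunny", "yes"),
    ("sunny", "no"),
    ("sunny", "no"),
    ("sunny", "no"),
    ("rainy", "yes"),
    ("rainy", "no"),
]

def as_written(samples, value, label):
    """Formula as stated in the bullet point: numerator counted over all of X."""
    n_value_in_X = sum(1 for x, _ in samples if x == value)
    n_class = sum(1 for _, c in samples if c == label)
    return n_value_in_X / n_class

def conditional(samples, value, label):
    """Proposed formula: numerator counted only among samples in C_i."""
    n_value_in_class = sum(1 for x, c in samples if x == value and c == label)
    n_class = sum(1 for _, c in samples if c == label)
    return n_value_in_class / n_class

print(as_written(samples, "sunny", "yes"))   # 4 / 2 = 2.0, not a valid probability
print(conditional(samples, "sunny", "yes"))  # 1 / 2 = 0.5
```

With the second version, the values for "sunny" and "rainy" given class "yes" sum to 1, as a conditional distribution should.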

Apologies if I am misunderstanding.
Also, could someone please answer this related thread: Bayes Theorem - The Naive Bayes Model probability calculation - #10 by chris.favila

Hello @bishopb,

I have not read that material myself, but I think your understanding is correct. The vertical bar means “given”, so the probability is evaluated over samples of C_i only.

Cheers,
Raymond

Hey @bishopb,
Yes, you are correct in this. It should be “… is the number of samples in C_i that have attribute x_k …”. Let me pass this on to the team, so that it can be rectified.

Cheers,
Elemento

Hi!

Thanks for noticing this, you are correct. I have already fixed it. You may need to refresh your workspace to see the updated version.

Thanks!
Lucas

Many thanks, that is a relief as I was quite confused!