In the chapter about anomaly detection, where Prof. Ng talks about the Gaussian distribution, what exactly does he mean by “the probability of x”?
When I hear the phrase “the probability of x” I’d usually think of the probability that a random number, according to the given probability distribution, is equal to x. In the case of the normal distribution this would always be equal to 0, which isn’t helpful.
In this context, he just seems to plug x into a density function of a probability distribution, and call it the probability of x. How does this make sense? And why would that be useful?
By looking at the probability distribution of X, we are able to assess if a particular value of X is highly probable or not.
As a simple example: if a value of x is highly probable (i.e., such a value of x happens many times and is not a rare occurrence), then we don’t need to consider it as an anomaly. However, if a certain value of x is not that probable (i.e., such a value of x is a rare occurrence), then the chances are higher that it represents an anomaly.
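To make the idea above concrete, here is a minimal sketch of flagging a value when its Gaussian density falls below a threshold. The sample readings, the threshold `epsilon`, and the helper names `fit_gaussian` and `is_anomaly` are all made up for illustration and are not taken from the course code. It also uses the standard library's sample standard deviation (dividing by n - 1), which may differ slightly from the estimate used in the lectures.

```python
from statistics import NormalDist


def fit_gaussian(samples):
    # Estimate mu and sigma from data assumed to be mostly "normal" examples.
    # Note: from_samples uses the sample standard deviation (n - 1 divisor).
    return NormalDist.from_samples(samples)


def is_anomaly(dist, x, epsilon):
    # Flag x when the density p(x) is below the threshold epsilon.
    return dist.pdf(x) < epsilon


# Hypothetical sensor readings clustered around 10.
normal_readings = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3]
dist = fit_gaussian(normal_readings)

epsilon = 0.05  # hypothetical threshold; in practice it would be tuned on labeled data

# 10.05 sits near the bulk of the data, 12.0 is far out in the tail.
flags = {x: is_anomaly(dist, x, epsilon) for x in (10.05, 12.0)}
print(flags)
```

The value being compared with `epsilon` is a density, not a probability, so it only works as a relative score: larger means "more typical", smaller means "rarer".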
@rmwkwok because the probabilities of all points have to add up to 1, but the normal distribution is given on uncountably infinitely many points, so they can’t have positive probabilities (or they would add up to infinity).
Ah, so I think @Elzear_Young you are arguing that the normal distribution is a probability density function, so without talking about a region of x, we can’t sum over it to get a probability mass.
We actually had that discussion before, and I’d highly suggest going through it again, or just reading my conclusion. This link will bring you to my conclusion of the discussion, but please start from the first post of the thread if you prefer. It would take some time to read the whole thread, but it would be worth it if it covers all your concerns.
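For anyone who wants to see the density versus probability distinction from this thread in numbers, here is a small sketch. The distribution parameters, the interval width, and the helper `interval_probability` are arbitrary choices for illustration. It shows that the density at a point can exceed 1, that the probability of hitting a single point exactly is 0, and that the density multiplied by a small width approximates the probability of landing in that small interval.

```python
from statistics import NormalDist

# A narrow Gaussian, chosen so the density at the mean is larger than 1.
dist = NormalDist(mu=0.0, sigma=0.1)

x = 0.0
density_at_x = dist.pdf(x)  # a density value, not a probability


def interval_probability(dist, a, b):
    # P(a < X <= b), computed from the cumulative distribution function.
    return dist.cdf(b) - dist.cdf(a)


# Probability of a single point: an interval of zero width.
point_probability = interval_probability(dist, x, x)

# Probability of a small interval around x, compared with density * width.
width = 0.001
exact = interval_probability(dist, x - width / 2, x + width / 2)
approx = density_at_x * width

print(density_at_x, point_probability, exact, approx)
```

So when the density is called “the probability of x”, a charitable reading is “proportional to the probability of landing in a tiny neighborhood of x”, which is enough for ranking points as typical or rare.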