Probability is area under the PDF curve

Why do we say the probability of a continuous random variable is the area under the PDF curve?

I understand that P(X = 2) = 0 if X is the amount of rain (in inches) on a particular day, modeled from historical rainfall data for that specific date, since we can never pin the value down with infinite precision. But can someone help me understand intuitively why we say probability is the area under the PDF, unlike a discrete distribution, where we know the probability of each exact value?

Link to the course video - https://www.coursera.org/learn/machine-learning-probability-and-statistics/lecture/194VO/probability-density-function.

Maybe this would be helpful.

@VivekKapoor Personally, I don’t like to think of the probability of an exact value as being ‘zero’, but rather as ‘infinitesimally small’.
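To make the ‘infinitesimally small’ idea concrete, here is a minimal sketch. It assumes, purely for illustration, that daily rainfall follows a normal distribution with a mean of 1.5" and a standard deviation of 0.5" (these numbers and the helper names are hypothetical, not from the course). It computes the probability that rainfall lands within a small window around 2" and shows that probability shrinking as the window narrows, staying close to density × window width.

```python
import math

# Hypothetical rainfall model (an assumption for illustration only):
# normal distribution with mean 1.5 inches and standard deviation 0.5 inches.
MU = 1.5
SIGMA = 0.5

def normal_pdf(x, mu=MU, sigma=SIGMA):
    """Height of the normal density curve at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def normal_cdf(x, mu=MU, sigma=SIGMA):
    """P(X <= x), i.e. the area under the density to the left of x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def interval_probability(center, half_width):
    """P(center - half_width <= X <= center + half_width)."""
    return normal_cdf(center + half_width) - normal_cdf(center - half_width)

for half_width in (0.5, 0.1, 0.01, 0.001):
    exact = interval_probability(2.0, half_width)
    # For a narrow window the area is roughly a rectangle: height * width.
    rectangle = normal_pdf(2.0) * (2.0 * half_width)
    print(f"window 2 +/- {half_width}: P = {exact:.6f}, "
          f"pdf * width = {rectangle:.6f}")
```

The density value at 2" is not itself a probability; it only becomes one after being multiplied by a width, which is why a window of width zero carries probability zero.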

I mean, really, the only way it is ‘zero’ is after the event has occurred. If you predict that, five minutes from now, the total rain for the day will be 2", there is still some possibility that this turns out to be the case.

But once those five minutes have passed and you take a measurement, it is as if the probability distribution ‘collapses’: the event in question (2" of rain) either has occurred, so its probability is now 1, or has not occurred, so its probability is zero, as it is for every other value that did not actually happen.

Recall that probability is measured on a scale from 0 to 1, so the total area under the curve is 1. If we are talking about a bell curve such as the standard normal, values near the mean are more likely, so an interval around the mean has a higher probability (more area, greater probability).

At the tails, by contrast, events are much rarer, so the area under the curve there is very small.
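As a rough check of the points above, the following sketch approximates areas under the standard normal PDF by summing thin rectangles (a midpoint Riemann sum). The function names, the step count, and the choice of the range [-8, 8] as a stand-in for the whole real line are my own assumptions for illustration.

```python
import math

def standard_normal_pdf(x):
    """Height of the standard normal (mean 0, std 1) density at x."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def area_under_pdf(a, b, steps=100_000):
    """Approximate the area under the PDF between a and b
    by adding up `steps` thin rectangles (midpoint rule)."""
    width = (b - a) / steps
    heights = (standard_normal_pdf(a + (i + 0.5) * width) for i in range(steps))
    return sum(heights) * width

# [-8, 8] captures essentially all of the curve for a standard normal.
total = area_under_pdf(-8.0, 8.0)

# Two intervals of the same width (2 units): one around the mean, one in the tail.
near_mean = area_under_pdf(-1.0, 1.0)
right_tail = area_under_pdf(2.0, 4.0)

print(f"total area       = {total:.4f}")
print(f"area on [-1, 1]  = {near_mean:.4f}")
print(f"area on [2, 4]   = {right_tail:.4f}")
```

Both intervals are the same width, yet the one around the mean holds about 68% of the total area while the one in the tail holds only a couple of percent, which is the "more area, greater probability" point in numbers.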