I am not sure why we need posterior probability. Is it the same as saying y hat = p(y = 1)? If not, what is the difference between the two?

A prior probability simply means the probability of y being 1, before seeing any data.

A posterior probability means the probability of y being 1 given the data. It says: given this particular set of x values, what is the probability of y being 1? In contrast, a prior probability would be the probability of any observation being 1, not this given observation.

that makes sense. Thank you!

Hi @xz90, and welcome to the DLS Specialization! The term *posterior* probability is not appropriate here, nor is *prior* probability. These concepts are part of Bayesian statistics, and are linked through Bayes' rule. We will not (explicitly) apply Bayesian statistics in the Specialization.

The term that you are searching for is *conditional* probability, which applies to probability and statistics writ large (i.e., both *frequentist* and *Bayesian*). The highlighted equation is read "y-hat equals the probability that y is equal to one, conditional on x being equal to x." The conditioning information x would be, for example, a particular example (e.g., an image).
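To make the distinction concrete, here is a minimal sketch in Python. All the numbers are made up for illustration: the labels, the weights `w`, the bias `b`, and the example `x` are hypothetical, and the logistic (sigmoid) model is just one common way to produce P(y = 1 | x), as in logistic regression.

```python
import math

# Hypothetical toy labels for a binary classification task.
labels = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]

# The unconditional probability P(y = 1), ignoring any features,
# estimated as the overall fraction of positive labels.
p_y1 = sum(labels) / len(labels)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# The conditional probability y_hat = P(y = 1 | x) for one example,
# computed here with a logistic model and made-up parameters.
w = [0.8, -0.5]   # hypothetical weights
b = 0.1           # hypothetical bias
x = [2.0, 1.0]    # one particular example's features

z = sum(wi * xi for wi, xi in zip(w, x)) + b
y_hat = sigmoid(z)

print(f"P(y=1), ignoring x:          {p_y1:.2f}")
print(f"P(y=1 | x), for this x:      {y_hat:.2f}")
```

The point of the sketch: the first number is the same for every observation, while the second depends on the specific x you condition on, which is exactly what y-hat represents.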

Thank you very much! This is very helpful.