Variance calculation for sample

What matters is whether your set of data is the entire population, or only a sample drawn from a larger population.

The formulas to compute the variance are slightly different for the two cases.
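Written out, the two formulas look roughly like this. The symbols are my own choice rather than anything fixed: x_i are the data values, N is how many there are, \mu is the mean when the data is the whole population, and \bar{x} is the mean of a sample.

```latex
% Population variance: the data is the entire population
\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2,
\qquad \mu = \frac{1}{N} \sum_{i=1}^{N} x_i

% Sample variance: the data is a sample from a larger population
s^2 = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2,
\qquad \bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i
```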

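The key reason the formulas differ is that with a sample you usually do not know the true mean of the population, so you measure each deviation from the mean of the sample instead. The sample mean is, by construction, the value that makes the sum of squared deviations for that sample as small as possible, so the sum comes out smaller on average than it would if you had used the true population mean. Dividing by (N) would therefore underestimate the variance of the population, and the effect is largest for small samples. Dividing by a slightly smaller number makes up for that; this adjustment is known as Bessel's correction.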

The full proof runs to some length, but you can find it in the Wikipedia article on “Variance” if you want the details.

The end result is that you divide by (N) if your data is the whole population, but you divide by (N-1) if you only have a sample of the population.
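If it helps to see the two divisors side by side, here is a minimal Python sketch. The function names and the data values are made up for illustration; the standard library's statistics.pvariance and statistics.variance do the same two calculations and are used here only as a cross-check.

```python
import statistics


def population_variance(values):
    """Variance when `values` is the entire population: divide by N."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


def sample_variance(values):
    """Variance when `values` is a sample: divide by N - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance needs at least two values")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


# Hypothetical data, chosen only so the arithmetic is easy to follow.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# The mean is 5 and the squared deviations sum to 32.
print(population_variance(data))  # 32 / 8
print(sample_variance(data))      # 32 / 7

# Cross-check against the standard library.
print(statistics.pvariance(data))
print(statistics.variance(data))
```

With these eight values the population figure is 4.0 and the sample figure is about 4.57. The gap between the two shrinks as N grows, which is why the choice matters most for small data sets.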