 Quandaries and Queries

Hi, my name is Rosalie.  I have a question about sample variance.

To give an unbiased estimate of the population variance, the denominator of the sample variance should be (n-1) instead of n.  I tried to convince myself by comparing the population variance and sample variance (with denominator n and n-1):

Consider the population {1, 2, 3, 4, 5}

Population mean mu = (1+2+3+4+5)/5=3

Population variance sigma^2 = [(1-3)^2 + (2-3)^2 + ··· + (5-3)^2]/5 = 2

Now, all the possible samples of size n = 4 drawn from the above population are considered.

Sample          Sample mean xbar    sum(x-xbar)^2 / n    sum(x-xbar)^2 / (n-1)
{1, 2, 3, 4}    2.5                 1.25                 1.67
{1, 2, 3, 5}    2.75                2.1875               2.92
{1, 2, 4, 5}    3                   2.5                  3.33
{1, 3, 4, 5}    3.25                2.1875               2.92
{2, 3, 4, 5}    3.5                 1.25                 1.67
Mean            3                   1.875                2.50
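The table can be reproduced with a short script. This is only a minimal sketch: the helper names `mean` and `sum_sq_dev` are made up for illustration, and exact fractions are used so that rounding does not affect the column averages.

```python
from fractions import Fraction
from itertools import combinations


def mean(values):
    """Exact arithmetic mean, returned as a Fraction."""
    values = list(values)
    return sum(values, Fraction(0)) / len(values)


def sum_sq_dev(sample):
    """Sum of squared deviations from the sample mean."""
    xbar = mean(sample)
    return sum((x - xbar) ** 2 for x in sample)


population = [1, 2, 3, 4, 5]
n = 4

# Every size-4 subset of the population, i.e. sampling WITHOUT replacement.
samples = list(combinations(population, n))

# Average each variance column over all the samples.
mean_div_n = mean(sum_sq_dev(s) / n for s in samples)
mean_div_n_minus_1 = mean(sum_sq_dev(s) / (n - 1) for s in samples)

print(len(samples), float(mean_div_n), float(mean_div_n_minus_1))
# prints: 5 1.875 2.5
```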

It looks like the mean of the variance calculated by dividing by n (1.875) gives a better estimate of the population variance (2).  Where did I go wrong?  Thanks for your help.

Hi Rosalie,

The sample variance, calculated by dividing by n-1, is an unbiased estimator of the population variance when the population is infinite or when you sample with replacement. In your example, sampling with replacement gives 5^4 = 625 possible samples rather than the 5 you examined. Rather than work through an example that large, I am going to repeat your calculation using the population {1, 2, 3} and samples of size 2.

Population mean mu = (1+2+3)/3 = 2

Population variance sigma^2 = [(1-2)^2 + (2-2)^2 + (3-2)^2]/3 = 2/3

Now, all the possible samples of size n = 2 drawn from the above population with replacement are considered.

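Sample    Sample mean xbar    sum(x-xbar)^2 / n    sum(x-xbar)^2 / (n-1)
{1, 1}    1                   0                    0
{1, 2}    1.5                 0.25                 0.5
{1, 3}    2                   1                    2
{2, 1}    1.5                 0.25                 0.5
{2, 2}    2                   0                    0
{2, 3}    2.5                 0.25                 0.5
{3, 1}    2                   1                    2
{3, 2}    2.5                 0.25                 0.5
{3, 3}    3                   0                    0
Mean      2                   3/9 = 1/3            6/9 = 2/3

Averaged over all nine samples, dividing by n-1 gives 2/3, which is exactly the population variance, while dividing by n gives 1/3, which underestimates it.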
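If you would like to check this by machine, including the 625-sample case that is too long to write out by hand, here is a minimal sketch. The function name `average_variances` is made up for this illustration; it lists every ordered sample drawn with replacement and averages both versions of the sample variance using exact fractions.

```python
from fractions import Fraction
from itertools import product


def average_variances(population, n):
    """Average both sample variances over all ordered samples of size n
    drawn WITH replacement from the population.

    Returns (number of samples, mean of the divide-by-n variance,
    mean of the divide-by-(n-1) variance).
    """
    samples = list(product(population, repeat=n))
    total_div_n = Fraction(0)
    total_div_n_minus_1 = Fraction(0)
    for sample in samples:
        xbar = Fraction(sum(sample), n)
        ss = sum((x - xbar) ** 2 for x in sample)  # sum of squared deviations
        total_div_n += ss / n
        total_div_n_minus_1 += ss / (n - 1)
    count = len(samples)
    return count, total_div_n / count, total_div_n_minus_1 / count


small = average_variances([1, 2, 3], 2)
large = average_variances([1, 2, 3, 4, 5], 4)
print(small)
print(large)
```

For {1, 2, 3} with n = 2 this gives 9 samples with averages 1/3 and 2/3, matching the table. For {1, 2, 3, 4, 5} with n = 4 it gives 625 samples with averages 3/2 and 2, so with replacement the n-1 version again averages to the population variance of 2.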

Penny
