Quandaries and Queries
 

 

Hi, my name is Rosalie.  I have a question about sample variance.

To give an unbiased estimate of the population variance, the denominator of the sample variance should be (n-1) instead of n.  I tried to convince myself by comparing the population variance and sample variance (with denominator n and n-1):

Consider the population {1, 2, 3, 4, 5}

Population mean mu = (1+2+3+4+5)/5=3

Population variance sigma^2 = [(1-3)^2 + (2-3)^2 + ··· + (5-3)^2]/5 = 2

Now, all the possible samples of size n = 4 drawn from the above population are considered.

Sample          Sample mean (xbar)   sum(x-xbar)^2 / n   sum(x-xbar)^2 / (n-1)
{1, 2, 3, 4}    2.5                  1.25                1.67
{1, 2, 3, 5}    2.75                 2.1875              2.92
{1, 2, 4, 5}    3                    2.5                 3.33
{1, 3, 4, 5}    3.25                 2.1875              2.92
{2, 3, 4, 5}    3.5                  1.25                1.67
Mean            3                    1.875               2.5

It looks like the mean of the variances calculated by dividing by n (1.875) gives a better estimate of the population variance (2) than the mean of those calculated by dividing by n-1.  Where did I go wrong?  Thanks for your help.

 

 

Hi Rosalie,

The sample variance, calculated by dividing by n-1, is an unbiased estimator of the population variance when the population is infinite or when you sample with replacement. In your example, sampling with replacement gives 5^4 = 625 possible samples rather than the 5 you examined. Rather than work through an example that large, I am going to repeat your calculation using the population {1, 2, 3} and samples of size 2.
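For readers who would rather let a computer handle the 625 samples, here is a minimal Python sketch of that larger calculation. It is only an illustration, not part of the original exchange: the helper name average_estimates is made up for this example, and it relies on the standard library's statistics.pvariance (divides by n) and statistics.variance (divides by n-1).

```python
from itertools import combinations, product
from statistics import mean, pvariance, variance

population = [1, 2, 3, 4, 5]
n = 4  # sample size used in the question


def average_estimates(samples):
    """Average both variance estimates over every sample supplied.

    Returns (mean of divide-by-n values, mean of divide-by-(n-1) values).
    This helper is hypothetical, written only for this illustration.
    """
    samples = list(samples)
    by_n = mean(pvariance(s) for s in samples)
    by_n_minus_1 = mean(variance(s) for s in samples)
    return by_n, by_n_minus_1


# The 5 samples drawn without replacement, as in the question.
without_repl = average_estimates(combinations(population, n))

# All 5**4 = 625 ordered samples drawn with replacement.
with_repl = average_estimates(product(population, repeat=n))

print("population variance:", pvariance(population))
print("without replacement (n, n-1):", without_repl)
print("with replacement    (n, n-1):", with_repl)
```

Up to floating-point rounding, the averages without replacement should come out near 1.875 and 2.5, while with replacement they should come out near 1.5 and 2, so only the n-1 version matches the population variance of 2.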

Population mean mu = (1+2+3)/3 = 2

Population variance sigma^2 = [(1-2)^2 + (2-2)^2 + (3-2)^2]/3 = 2/3

Now, all the possible samples of size n = 2 drawn from the above population with replacement are considered.

Sample    Sample mean (xbar)   sum(x-xbar)^2 / n   sum(x-xbar)^2 / (n-1)
{1, 1}    1                    0                   0
{1, 2}    1.5                  0.25                0.5
{1, 3}    2                    1                   2
{2, 1}    1.5                  0.25                0.5
{2, 2}    2                    0                   0
{2, 3}    2.5                  0.25                0.5
{3, 1}    2                    1                   2
{3, 2}    2.5                  0.25                0.5
{3, 3}    3                    0                   0
Mean      2                    3/9 = 1/3           6/9 = 2/3
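The nine-sample calculation can also be checked with exact fractions. The following is a small sketch, assuming Python's standard fractions and itertools modules; the function name sum_sq_dev is chosen here for illustration only.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3]
n = 2  # sample size


def sum_sq_dev(values):
    """Sum of squared deviations from the mean, kept as an exact fraction."""
    xbar = Fraction(sum(values), len(values))
    return sum((x - xbar) ** 2 for x in values)


# All 3**2 = 9 ordered samples drawn with replacement.
samples = list(product(population, repeat=n))

mean_by_n = sum(sum_sq_dev(s) / n for s in samples) / len(samples)
mean_by_n_minus_1 = sum(sum_sq_dev(s) / (n - 1) for s in samples) / len(samples)
pop_var = sum_sq_dev(population) / len(population)

print("mean of divide-by-n column:    ", mean_by_n)
print("mean of divide-by-(n-1) column:", mean_by_n_minus_1)
print("population variance:           ", pop_var)
```

The divide-by-(n-1) average equals the population variance, 2/3, exactly, while the divide-by-n average is 1/3, half the true value.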

Penny

 
 
