Quandaries and Queries
 

 

Hi there,

I am currently teaching the coefficient of variation and am wondering if there are some guidelines for interpreting this statistic. I understand that it measures the variation in a variable relative to the mean, but what is the cutoff for "too much" variation expressed in this way?

Many thanks,
Jan
teacher
 

 

Hi Jan,

The coefficient of variation expresses the standard deviation as a percentage of the sample mean. This is useful when the interest is in the size of the variation relative to the size of the observations, and it has the advantage of being INDEPENDENT OF the UNITS of observation. For example, the standard deviation of a set of weights will differ depending on whether they are measured in kilograms or pounds. The coefficient of variation, however, will be the same in both cases: changing units multiplies the standard deviation and the mean by the same factor, so the factor cancels in the ratio.
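To make the unit independence concrete, here is a small Python sketch. The weights are made-up illustrative values, and the helper name coefficient_of_variation is our own; it uses the sample standard deviation (n - 1 in the denominator), which is one common convention.

```python
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation expressed as a percentage of the mean."""
    return 100 * statistics.stdev(values) / statistics.mean(values)

# Made-up weights for illustration only.
weights_kg = [61.0, 72.5, 68.2, 80.1, 75.4]

# Convert the same weights to pounds (1 kg is about 2.20462 lb).
KG_TO_LB = 2.20462
weights_lb = [w * KG_TO_LB for w in weights_kg]

sd_kg = statistics.stdev(weights_kg)
sd_lb = statistics.stdev(weights_lb)
cv_kg = coefficient_of_variation(weights_kg)
cv_lb = coefficient_of_variation(weights_lb)

# The standard deviations differ with the unit; the two CVs agree.
print(f"SD: {sd_kg:.3f} kg vs {sd_lb:.3f} lb")
print(f"CV: {cv_kg:.3f}% vs {cv_lb:.3f}%")
```

The two standard deviations differ by the conversion factor, while the two coefficients of variation agree up to floating-point rounding.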

The value at which there is "too much" variation seems to depend on the subject area. In an Internet search we found three applications, each with a different value.

An application to sports includes the statement:

The coefficient of variation of an individual athlete's performance is typically a few percent.

A site about llamas, Llamapaedia, has a page on fiber testing which contains this sentence about the coefficient of variation:

Twenty-five percent or less is desirable.

Finally, a page titled Statistics for Microarray Analysis says that in their application the user sets the cutoff value, and it gives 3% and 5% as examples:

Coefficient of Variation filter

The coefficient of variation filter is used to measure the consistency of the gene across all experiments. The coefficient of variation (CV) of each gene is calculated as standard deviation divided by mean. A high CV value reflects inconsistency among the samples within the group.

Usage

For a two-group comparison study as mentioned above, the CV of each group is calculated separately. The user predefines a cutoff value, where genes with CVs above the cutoff value are eliminated. The users have the option of defining different cutoff thresholds for each experimental group to control the consistency level of the genes. For example, the user could set the parameter to [0.03, 0.05] which would remove genes where group1's CV is above 0.03, and group2's CV is above 0.05.
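As a rough illustration of the kind of filter described in that passage, here is a minimal Python sketch. It is not the code from that page: the function names cv and cv_filter, the gene names, and the expression values are all hypothetical. It also makes two assumptions the passage does not settle: that the sample standard deviation is used, and that a gene is removed when either group's CV exceeds that group's cutoff.

```python
import statistics

def cv(values):
    """CV as a fraction: sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)

def cv_filter(genes, cutoffs):
    """Keep genes whose CV is at or below the cutoff in every group.

    genes   -- dict mapping a gene name to a tuple of per-group value lists
    cutoffs -- one cutoff per group, e.g. (0.03, 0.05)
    """
    kept = {}
    for name, groups in genes.items():
        # Assumption: exceeding the cutoff in any one group removes the gene.
        if all(cv(g) <= c for g, c in zip(groups, cutoffs)):
            kept[name] = groups
    return kept

# Hypothetical expression values: (group1 samples, group2 samples).
genes = {
    "geneA": ([10.0, 10.1, 9.9], [20.0, 20.5, 19.5]),  # consistent in both
    "geneB": ([10.0, 14.0, 6.0], [20.0, 20.5, 19.5]),  # erratic in group1
}

kept = cv_filter(genes, (0.03, 0.05))
print(sorted(kept))
```

With these values, geneA has CVs of about 0.01 and 0.025 and passes both cutoffs, while geneB has a group1 CV of about 0.4 and is removed.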

Andrei and Penny