Quandaries and Queries
I need to calculate the standard deviation for a group of data, but I don't know the mean in advance. Is there a way to adjust the STDV for each datum without keeping all of the previous values? This is needed basically for performance, so that I won't need to read the same data twice (spending processing time) or save the previous values (spending memory).
Two of us disagree on what you are asking, so we are going to answer both of our interpretations. Hopefully one of them is the question you want answered.
You have some data and you have calculated the standard deviation but you don't know the mean. You now have a new observation and you want to update the standard deviation to include this new observation. Can you calculate the new standard deviation without knowledge of the mean? The answer here is no.
You have some data and you want to calculate the standard deviation without calculating the mean first and then reading the data a second time. The answer here is yes as long as you can store three values while you are reading the data.
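Suppose that the data set is x1, x2, ..., xn. Then the sample variance is given by
\[ s^2 = \frac{\sum_{i=1}^{n} x_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)^2}{n - 1} \]
and the standard deviation is the square root of the variance. Thus, after reading the data once, you can calculate the standard deviation if you know
\[ n, \qquad \sum_{i=1}^{n} x_i, \qquad \sum_{i=1}^{n} x_i^2. \]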
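The single-pass calculation described above can be sketched in Python as follows. This is only an illustration: the function name `one_pass_std` is made up for the example, the three stored values are assumed to be the count, the running sum, and the running sum of squares, and the sample (n - 1) denominator is assumed.

```python
import math


def one_pass_std(data):
    """Sample standard deviation from a single pass over `data`.

    Only three running values are kept (an assumption of this sketch):
    the count, the sum, and the sum of squares.
    """
    n = 0           # number of observations read so far
    total = 0.0     # running sum of the x_i
    total_sq = 0.0  # running sum of the x_i squared
    for x in data:
        n += 1
        total += x
        total_sq += x * x
    if n < 2:
        raise ValueError("need at least two observations")
    # Sample variance with the n - 1 denominator.
    variance = (total_sq - total * total / n) / (n - 1)
    # Rounding error can push a tiny variance slightly below zero.
    return math.sqrt(max(variance, 0.0))


print(one_pass_std([2, 4, 4, 4, 5, 5, 7, 9]))
```

Because the data are consumed one value at a time, `data` can be any iterable, including a generator that reads from a file. One caution: when the values are very large compared with their spread, subtracting two nearly equal sums can lose precision in floating-point arithmetic.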
Andrei and Penny
In March of 2004 we received the following note from Britton.
Knuth attributes this method to B.P. Welford, Technometrics, 4,(1962), 419-420.
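The method usually associated with Welford's paper keeps a running mean and a running sum of squared deviations from that mean, and updates both as each observation arrives. A minimal Python sketch of that idea follows; the class name `RunningStats` and its method names are hypothetical, and the sample (n - 1) denominator is an assumption.

```python
import math


class RunningStats:
    """Running mean and standard deviation, updated one value at a time."""

    def __init__(self):
        self.n = 0       # number of observations seen
        self.mean = 0.0  # running mean
        self.m2 = 0.0    # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean           # deviation from the old mean
        self.mean += delta / self.n     # new mean
        self.m2 += delta * (x - self.mean)  # old deviation times new deviation

    def std(self):
        if self.n < 2:
            raise ValueError("need at least two observations")
        # Sample standard deviation (n - 1 denominator, assumed here).
        return math.sqrt(self.m2 / (self.n - 1))


stats = RunningStats()
for value in [2, 4, 4, 4, 5, 5, 7, 9]:
    stats.update(value)
print(stats.mean, stats.std())
```

This still stores only three numbers, but it avoids subtracting two large, nearly equal sums, so it tends to behave better in floating-point arithmetic when the data are large relative to their spread.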