My name is Rod, and I am a 7th/8th grade math teacher. In our Prealgebra course, we have been studying Box and Whisker plots. Recently, we learned how to decide whether a data point is an outlier. The book (Math Thematics, McDougal Littell) gave this process: find the interquartile range, then multiply it by 1.5. Add this number to the upper quartile; any points above the result are considered outliers. Likewise, subtract the number from the lower quartile; any points below that result are also outliers.
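[For readers who want to check the arithmetic, here is a minimal Python sketch of the procedure described above. The function name iqr_fences and the sample data are illustrative, not from the textbook, and note that software packages compute quartiles by slightly different conventions, so the fences may differ a little from a hand calculation.]

```python
from statistics import quantiles

def iqr_fences(data, k=1.5):
    """Return (lower_fence, upper_fence) using the k * IQR rule."""
    q1, _, q3 = quantiles(data, n=4)   # lower quartile, median, upper quartile
    iqr = q3 - q1                      # interquartile range
    return q1 - k * iqr, q3 + k * iqr  # points beyond these are outliers

scores = [52, 58, 61, 63, 64, 67, 70, 72, 75, 98]
low, high = iqr_fences(scores)
print([x for x in scores if x < low or x > high])  # -> [98]
```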

My question: where does this 1.5 originate? Is it the standard for locating outliers, or can we choose any reasonable number (2 or 1.8, for example) to multiply by the interquartile range? If it is a standard, were outliers simply defined via this process, or did statisticians use empirical evidence to suggest that 1.5 is somehow optimal for deciding whether data points are valid?

Thank you for any assistance that you can lend.

Hi Rod,

Box and Whisker plots were invented by John Tukey, a statistician at Princeton University and Bell Telephone Laboratories. In his book Exploratory Data Analysis (Addison-Wesley, 1977), when introducing what he calls outside values, he says, "It is convenient to have a rule of thumb that picks out certain values as 'outside' or 'far out'." The outside values are what you are calling outliers; for far out values, you multiply the interquartile range by 3 instead of 1.5 and again add to the upper quartile or subtract from the lower quartile. I cannot find any further justification for his choice of 1.5; his own wording suggests it was chosen as a convenient rule of thumb rather than derived empirically.
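[In code terms, the far out rule is the same computation with 3 in place of 1.5. Continuing the illustrative sketch from earlier in this exchange:]

```python
# Tukey's "outside" values use k = 1.5, and his "far out" values
# use the same fence computation with k = 3.
inner_low, inner_high = iqr_fences(scores, k=1.5)  # outside values lie beyond these
outer_low, outer_high = iqr_fences(scores, k=3)    # far out values lie beyond these
print(inner_low, inner_high, outer_low, outer_high)
```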

Cheers,
Penny