Winsorized mean

A '''winsorized mean''' is a [[winsorising|winsorized]] [[statistical]] [[measure of central tendency]], much like the [[mean]] and [[median]], and most closely related to the [[truncated mean]]. It is the mean calculated after [[winsorizing]], that is, after replacing given parts of a [[probability distribution]] or [[Sampling (statistics)|sample]] at the high and low ends with the most extreme remaining values.<ref>[[Yadolah Dodge|Dodge, Y]] (2003) ''The Oxford Dictionary of Statistical Terms'', OUP. {{ISBN|0-19-920613-9}} (entry for "winsorized estimation")</ref> The same amount is typically replaced at both extremes; often 10 to 25 percent of the ends are replaced. Equivalently, the winsorized mean can be expressed as a [[weighted average]] of the truncated mean and the two quantiles at which the data are limited, because winsorizing replaces each tail with the corresponding quantile.
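
The procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name <code>winsorized_mean</code>, its <code>proportion</code> parameter (the fraction replaced at each end), and the choice to round the number of replaced values down are assumptions made for this example.

```python
def winsorized_mean(values, proportion):
    """Mean after replacing the k lowest and k highest values.

    k = floor(proportion * n) values are replaced at EACH end
    (an assumed convention for this sketch).
    """
    data = sorted(values)
    n = len(data)
    if n == 0:
        raise ValueError("values must not be empty")
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be in [0, 0.5)")
    # Small tolerance guards against products such as 0.3 * 10 landing
    # just below an integer; k is capped so at least one value remains.
    k = min(int(proportion * n + 1e-9), (n - 1) // 2)
    low, high = data[k], data[n - k - 1]  # most extreme remaining values
    clipped = [min(max(x, low), high) for x in data]
    return sum(clipped) / n


sample = [2, 4, 5, 5, 6, 6, 7, 7, 8, 50]
print(winsorized_mean(sample, 0.1))  # 6.0 (the plain mean is 10.0)
```

Here the smallest value 2 is replaced by 4 and the largest value 50 by 8 before averaging.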

==Advantages==

The winsorized mean is a useful estimator because it retains the [[outlier]]s without taking them too literally: it is less sensitive to observations at the extremes than the ordinary mean, yet it still gives a reasonable estimate of central tendency for almost all statistical models. In this sense it is referred to as a [[robust estimator]].
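
The reduced sensitivity to extreme observations can be illustrated with a small comparison. The data, the helper name <code>winsorize</code>, and its integer argument <code>k</code> (the number of values replaced at each end) are hypothetical and chosen only for this sketch.

```python
from statistics import mean


def winsorize(values, k):
    """Sorted copy with the k smallest and k largest values replaced
    by the nearest retained value."""
    data = sorted(values)
    n = len(data)
    if k < 0 or 2 * k >= n:
        raise ValueError("k must satisfy 0 <= 2*k < len(values)")
    return [data[k]] * k + data[k:n - k] + [data[n - k - 1]] * k


clean = [3, 4, 4, 5, 5, 6, 6, 7, 7, 8]
contaminated = clean[:-1] + [800]  # one gross error replaces the largest value

# How far each estimate moves when the single outlier is introduced.
mean_shift = mean(contaminated) - mean(clean)
winsorized_shift = mean(winsorize(contaminated, 1)) - mean(winsorize(clean, 1))

print(mean_shift, winsorized_shift)
```

In this example the ordinary mean moves by roughly 79, while the winsorized mean does not move at all, because the outlier is replaced by the same neighbouring value in both samples.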

==Drawbacks==

The winsorized mean uses more information from the distribution or sample than the [[median]]. However, unless the underlying distribution is [[Symmetric probability distribution|symmetric]], the winsorized mean of a sample is unlikely to be an [[unbiased estimator]] of either the mean or the median.
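
A small Monte Carlo sketch can make the bias under asymmetry concrete. Everything specific here is an assumption chosen for illustration: the exponential distribution with rate 1 (mean 1, median ln 2), the sample size of 20, two values replaced at each end, the number of replications, and the fixed seed.

```python
import math
import random


def winsorized_mean(values, k):
    """Winsorized mean with k values replaced at each end."""
    data = sorted(values)
    n = len(data)
    body = data[k:n - k]  # values that are kept unchanged
    return (k * body[0] + sum(body) + k * body[-1]) / n


rng = random.Random(0)  # fixed seed so the run is repeatable
n, k, replications = 20, 2, 2000

estimates = [
    winsorized_mean([rng.expovariate(1.0) for _ in range(n)], k)
    for _ in range(replications)
]
average_estimate = sum(estimates) / replications

true_mean = 1.0             # mean of the exponential distribution, rate 1
true_median = math.log(2)   # its median, about 0.693

print(true_median, average_estimate, true_mean)
```

Under these assumptions the average of the estimates settles between the median and the mean of the distribution, roughly at 0.9, so the winsorized mean is centred on neither.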

==Example==

For a sample of 10 numbers (from ''x''<sub>(1)</sub>, the smallest, to ''x''<sub>(10)</sub>, the largest, in [[order statistic]] notation), the 10% winsorized mean, with 10% of the values replaced at each end, is

<math display="block">\frac{\overbrace{x_{(2)} + x_{(2)}} + x_{(3)} + x_{(4)} + x_{(5)} + x_{(6)} + x_{(7)} + x_{(8)} + \overbrace{x_{(9)} + x_{(9)}}}{10}. \, </math>

The key is in the repetition of ''x''<sub>(2)</sub> and ''x''<sub>(9)</sub>: the extra copies stand in for the original values ''x''<sub>(1)</sub> and ''x''<sub>(10)</sub>, which have been replaced.

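This is equivalent to a weighted average of 0.1 times the lower limiting value ''x''<sub>(2)</sub>, 0.8 times the 10% [[Truncated_mean|trimmed mean]] (the mean of the eight values ''x''<sub>(2)</sub> to ''x''<sub>(9)</sub>), and 0.1 times the upper limiting value ''x''<sub>(9)</sub>.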
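
Written out, assuming the 10% trimmed mean drops one value from each end of the sample of 10, the weighted average reduces to the winsorized mean because the factor 0.8 cancels the denominator of the trimmed mean:

<math display="block">0.1\,x_{(2)} + 0.8 \cdot \frac{x_{(2)} + x_{(3)} + \cdots + x_{(9)}}{8} + 0.1\,x_{(9)} = \frac{x_{(2)} + \left(x_{(2)} + x_{(3)} + \cdots + x_{(9)}\right) + x_{(9)}}{10}.</math>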

==Notes==
{{reflist}}
{{inline|date=March 2012}}

==References==
*{{cite journal|first1=R.R.|last1=Wilcox|first2=H.J.|last2=Keselman|title=Modern robust data analysis methods: Measures of central tendency|year=2003|journal=Psychological Methods|volume=8|pages=254–274|pmid=14596490|issue=3|doi=10.1037/1082-989X.8.3.254}}

[[Category:Means]]
[[Category:Robust statistics]]

[[de:Mittelwert#Winsorisiertes und getrimmtes Mittel]]