{{Short description|Statistical measure of central tendency}}
{{Refimprove|date=September 2009}}
A '''winsorized mean''' is a [[winsorising|winsorized]] [[statistical]] [[measure of central tendency]], much like the [[mean]] and [[median]], and even more similar to the [[truncated mean]]. It is the mean calculated after [[winsorizing]], that is, after replacing given parts of a [[probability distribution]] or [[Sampling (statistics)|sample]] at the high and low ends with the most extreme remaining values.<ref>[[Yadolah Dodge|Dodge, Y]] (2003) ''The Oxford Dictionary of Statistical Terms'', OUP. {{ISBN|0-19-920613-9}} (entry for "winsorized estimation")</ref> Equal amounts are typically replaced at both extremes; often 10 to 25 percent of the ends are replaced. The winsorized mean can equivalently be expressed as a [[weighted average]] of the truncated mean and the quantiles at which it is limited, which corresponds to replacing the extreme parts with those quantiles.
==Advantages==
The winsorized mean is a useful estimator because, by retaining the [[outlier]]s without taking them too literally, it is less sensitive to observations at the extremes than the ordinary mean, while still giving a reasonable estimate of central tendency for almost all statistical models. In this regard it is referred to as a [[robust estimator]].
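The reduced sensitivity to extreme observations can be illustrated with a minimal Python sketch. The function name <code>winsorized_mean</code>, its parameter <code>k</code> (the number of values replaced at each end) and the data are illustrative assumptions, not taken from a cited source.

```python
def winsorized_mean(values, k):
    """Mean after replacing the k smallest and k largest values
    with the nearest remaining order statistics.

    The name and the parameter k are hypothetical, chosen for this sketch.
    """
    xs = sorted(values)
    n = len(xs)
    if k < 0 or 2 * k >= n:
        raise ValueError("k must satisfy 0 <= 2k < n")
    low, high = xs[k], xs[n - k - 1]  # most extreme values that are kept
    # Clipping to [low, high] replaces the k values at each end.
    clipped = [min(max(x, low), high) for x in xs]
    return sum(clipped) / n


# Made-up data: the second sample differs only in one gross outlier.
clean = [2, 3, 3, 4, 4, 5, 5, 6, 6, 7]
contaminated = clean[:-1] + [700]

plain_clean = sum(clean) / len(clean)                      # 4.5
plain_contaminated = sum(contaminated) / len(contaminated)  # 73.8
wins_clean = winsorized_mean(clean, 1)                     # 4.5
wins_contaminated = winsorized_mean(contaminated, 1)       # 4.5
```

In this sketch the single outlier moves the ordinary mean from 4.5 to 73.8, while the 10% winsorized mean stays at 4.5 because the outlier is replaced by the largest remaining value.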
==Drawbacks==
The winsorized mean uses more information from the distribution or sample than the [[median]]. However, unless the underlying distribution is [[Symmetric probability distribution|symmetric]], the winsorized mean of a sample is unlikely to be an [[unbiased estimator]] of either the mean or the median.
==Example==
For a sample of 10 numbers (from ''x''<sub>(1)</sub>, the smallest, to ''x''<sub>(10)</sub>, the largest; [[order statistic]] notation) the 10% winsorized mean is
<math display="block">\frac{\overbrace{x_{(2)} + x_{(2)}} + x_{(3)} + x_{(4)} + x_{(5)} + x_{(6)} + x_{(7)} + x_{(8)} + \overbrace{x_{(9)} + x_{(9)}}}{10}. \, </math>
The key is in the repetition of ''x''<sub>(2)</sub> and ''x''<sub>(9)</sub>: the extras substitute for the original values ''x''<sub>(1)</sub> and ''x''<sub>(10)</sub> which have been discarded and replaced.
This is equivalent to a weighted average of 0.1 times the 5th percentile (''x''<sub>(2)</sub>), 0.8 times the 10% [[Truncated_mean|trimmed mean]], and 0.1 times the 95th percentile (''x''<sub>(9)</sub>).
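The equivalence between the two expressions can be checked numerically. The following Python sketch uses made-up data (not from a cited source) and zero-based list indices, so <code>x[1]</code> stands for ''x''<sub>(2)</sub> and <code>x[8]</code> for ''x''<sub>(9)</sub>.

```python
# Illustrative sample of 10 numbers; the values are an assumption of this sketch.
sample = [12, 3, 7, 25, 9, 1, 14, 8, 10, 41]
x = sorted(sample)  # x[0] is x_(1), ..., x[9] is x_(10)

# 10% winsorized mean: x_(1) is replaced by x_(2) and x_(10) by x_(9),
# so x_(2) and x_(9) each appear twice in the numerator.
winsorized = (x[1] + sum(x[1:9]) + x[8]) / 10

# 10% trimmed mean: x_(1) and x_(10) are dropped, leaving 8 values.
trimmed = sum(x[1:9]) / 8

# Weighted average of the lower limit, the trimmed mean and the upper limit.
weighted = 0.1 * x[1] + 0.8 * trimmed + 0.1 * x[8]

# The two results agree up to floating-point rounding.
difference = abs(winsorized - weighted)
```

The agreement follows because 0.8 times the trimmed mean equals the sum of ''x''<sub>(2)</sub> through ''x''<sub>(9)</sub> divided by 10, and the two remaining terms supply the repeated ''x''<sub>(2)</sub> and ''x''<sub>(9)</sub>.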
==Notes==
{{reflist}}
{{inline|date=March 2012}}
==References==
*{{cite journal|first1=R.R.|last1=Wilcox|first2=H.J.|last2=Keselman|title=Modern robust data analysis methods: Measures of central tendency|year=2003|journal=Psychological Methods|volume=8|pages=254–274|pmid=14596490|issue=3|doi=10.1037/1082-989X.8.3.254}}
[[Category:Means]] [[Category:Robust statistics]]
[[de:Mittelwert#Winsorisiertes und getrimmtes Mittel]]