{{Short description|Statistical accuracy measure}}
The '''symmetric mean absolute percentage error''' ('''SMAPE''' or '''sMAPE''') is an accuracy measure based on percentage (or relative) errors. It is usually defined{{Citation needed|reason=S. Makridakis didn't use the following definition in his article ''Accuracy measures: theoretical and practical concerns,'' 1993.|date=May 2017}} as follows:

: <math> \text{SMAPE} = \frac{2}{n} \sum_{t=1}^n \frac{\left|F_t-A_t\right|}{|A_t|+|F_t|}</math>

where <math>A_t</math> are the actual values and <math>F_t</math> are the forecasted values. Note that if <math>A_t = F_t = 0</math>, then term <math>t</math> is undefined (<math>0/0</math>), and is usually ignored in the summation.

In words, the absolute difference between ''A''<sub>''t''</sub> and ''F''<sub>''t''</sub> is divided by half the sum of the absolute values of the actual value ''A''<sub>''t''</sub> and the forecast value ''F''<sub>''t''</sub>. This ratio is summed over every fitted point ''t'' and the total is divided by the number of fitted points&nbsp;''n''.
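As a minimal sketch of the definition above, the following Python function computes SMAPE as a fraction (multiply by 100 for a percentage). The function name <code>smape</code> is illustrative. Skipped 0/0 terms are also left out of the count ''n''; this is an assumption, since the text only says such terms are ignored in the summation.

```python
def smape(actual, forecast):
    """Return SMAPE as a fraction in [0, 2] for two equal-length sequences."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    terms = []
    for a, f in zip(actual, forecast):
        denom = abs(a) + abs(f)
        if denom == 0:
            # A_t = F_t = 0 gives 0/0; the term is skipped.
            # Assumption: skipped terms do not count towards n.
            continue
        terms.append(abs(f - a) / denom)
    if not terms:
        raise ValueError("no defined terms to average")
    return 2 * sum(terms) / len(terms)


print(smape([100, 100], [110, 90]))
```

If skipped terms should still count towards ''n'', divide by <code>len(actual)</code> instead of <code>len(terms)</code>.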

== History ==

The earliest reference to a similar formula appears to be Armstrong (1985, p.&nbsp;348), where it is called "adjusted MAPE" and is defined without the absolute values in the denominator. It was later discussed, modified, and re-proposed by Flores (1986).

Armstrong's original definition is as follows:

: <math> \text{SMAPE} = \frac 1 n \sum_{t=1}^n \frac{\left|F_t-A_t\right|}{(A_t+F_t)/2}</math>

The problem is that a term of this sum is negative whenever <math>A_t + F_t < 0</math>, and undefined when <math>A_t + F_t = 0</math>. Therefore, the currently accepted version of SMAPE uses absolute values in the denominator.
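The sign problem can be illustrated with a short Python sketch. Both function names are hypothetical, and the data are made up for illustration; the first function follows Armstrong's formula as written above, the second uses absolute values in the denominator.

```python
def adjusted_mape(actual, forecast):
    """Armstrong-style formula: signed denominator (A_t + F_t) / 2."""
    n = len(actual)
    return sum(abs(f - a) / ((a + f) / 2) for a, f in zip(actual, forecast)) / n


def smape_abs_denominator(actual, forecast):
    """Same formula, but with (|A_t| + |F_t|) / 2 in the denominator."""
    n = len(actual)
    return sum(
        abs(f - a) / ((abs(a) + abs(f)) / 2) for a, f in zip(actual, forecast)
    ) / n


# Illustrative negative-valued data: A = -100, F = -110.
actual, forecast = [-100], [-110]
print(adjusted_mape(actual, forecast))          # negative "error"
print(smape_abs_denominator(actual, forecast))  # same magnitude, positive
```

Neither function guards against a zero denominator; that case is left out to keep the sketch short.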

== Discussion ==

=== Comparison with MAPE ===

The idea behind '''SMAPE''' is that over- and under-forecasts are treated in a relative way, rather than an absolute way, as with the mean absolute percentage error ('''MAPE'''). For example, applying the formula above to some actual <math>A</math> and forecasted <math>F</math> values:

{| class="wikitable"
|-
! <math>A</math> !! <math>F</math> !! MAPE !! SMAPE
|-
| 100 || 110 || 10% || 9.52%
|-
| 100 || 90 || 10% || 10.53%
|}

we see that MAPE considers an over- and an underestimation of 10% as equivalent, whereas SMAPE considers the underestimation to be slightly "worse" than the overestimation.

Extending this to larger forecast errors:

{| class="wikitable"
|-
! <math>A</math> !! <math>F</math> !! MAPE !! SMAPE
|-
| 100 || 200 || 100% || 66.67%
|-
| 100 || 50 || 50% || 66.67%
|}

Here, ''double'' overestimation and ''half'' underestimation are treated equivalently by SMAPE, whereas MAPE considers the overestimation to be "twice as bad" as the underestimation.

Extending to an even more extreme case:

{| class="wikitable"
|-
! <math>A</math> !! <math>F</math> !! MAPE !! SMAPE
|-
| 100 || 1,000 || 900% || 163.64%
|-
| 100 || 10 || 90% || 163.64%
|}

Here it becomes clear that MAPE is unbounded from above and can impose extremely large penalties for overestimations, but cannot do the same for extreme underestimations: with non-negative values, its penalty for an underestimation never exceeds 100%. SMAPE, on the other hand, is bounded between 0% and 200%, and penalises these larger over- and underestimations in a more "symmetric" manner.
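The single-point comparisons in the tables above can be recomputed with a short Python sketch. The helper names are illustrative, and each function handles one (''A'', ''F'') pair, which is the ''n'' = 1 case of the respective formula.

```python
def abs_pct_error(a, f):
    """Single-point MAPE term: |F - A| / |A| (requires A != 0)."""
    return abs(f - a) / abs(a)


def sym_abs_pct_error(a, f):
    """Single-point SMAPE term: 2 |F - A| / (|A| + |F|)."""
    return 2 * abs(f - a) / (abs(a) + abs(f))


pairs = [(100, 110), (100, 90), (100, 200), (100, 50), (100, 1000), (100, 10)]
for a, f in pairs:
    # The ".2%" format multiplies by 100 and rounds to two decimals.
    print(f"A={a} F={f} MAPE={abs_pct_error(a, f):.2%} "
          f"SMAPE={sym_abs_pct_error(a, f):.2%}")
```

Values are rounded to two decimal places by the format specifier, so the last digit can differ from a truncated figure.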

Therefore, the choice between MAPE and SMAPE depends entirely on the problem at hand, and whether or not a '''relative''' metric is more appropriate. This may be the case if the expected forecasting errors are large (<math>\gg 10\%</math>); for smaller errors, the MAPE is more frequently chosen, due to its simplicity and ease of interpretation.

=== Alternative versions ===

As a "percentage error", SMAPE values between 0% and 100% can be considered easier to interpret, and an alternative formula is sometimes used in practice:

: <math> \text{SMAPE} = \frac{1}{n} \sum_{t=1}^n \frac{|F_t-A_t|}{|A_t|+|F_t|}</math>

There is also a third version of SMAPE, which allows measuring the direction of the bias in the data by generating a positive and a negative error at the line-item level. Furthermore, it is better protected against outliers and the bias effect{{clarify|date=October 2025}}. The formula is:

: <math> \text{SMAPE} = \frac{\sum_{t=1}^n \left|F_t-A_t\right|}{\sum_{t=1}^n (A_t+F_t)}</math>
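The two alternative formulas in this section can be sketched in Python as follows. The function names and sample data are illustrative, and both functions implement the formulas exactly as written above; they assume no 0/0 terms and, for the aggregate version, a non-zero sum in the denominator.

```python
def smape_half(actual, forecast):
    """Variant without the factor 2, so the result lies in [0, 1]."""
    terms = [abs(f - a) / (abs(a) + abs(f)) for a, f in zip(actual, forecast)]
    return sum(terms) / len(terms)


def smape_aggregate(actual, forecast):
    """Ratio of summed absolute errors to the summed (A_t + F_t)."""
    numerator = sum(abs(f - a) for a, f in zip(actual, forecast))
    denominator = sum(a + f for a, f in zip(actual, forecast))
    return numerator / denominator


actual = [100, 100]
forecast = [110, 90]
print(smape_half(actual, forecast))
print(smape_aggregate(actual, forecast))
```

Because the aggregate version divides one total by another, a single small ''A''<sub>''t''</sub> + ''F''<sub>''t''</sub> cannot dominate the result the way it can in a mean of per-point ratios.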

== Alternatives ==

Provided the data are strictly positive, an alternative measure of relative accuracy can be obtained based on the log of the accuracy ratio: log(''F''<sub>''t''</sub> / ''A''<sub>''t''</sub>). This measure is easier to analyze statistically and has valuable symmetry and unbiasedness properties. When used in constructing forecasting models, the resulting prediction corresponds to the geometric mean (Tofallis, 2015){{clarify|date=October 2025}}, whereas ordinary least squares models predict the arithmetic mean. The geometric mean is less affected by outliers than the arithmetic mean.
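The symmetry of the log accuracy ratio can be checked with a small Python sketch. The function name is illustrative and the natural logarithm is assumed, since the text does not fix a base.

```python
import math


def log_accuracy_ratio(actual, forecast):
    """Return log(F / A); both values must be strictly positive."""
    if actual <= 0 or forecast <= 0:
        raise ValueError("log accuracy ratio requires strictly positive data")
    return math.log(forecast / actual)


# Doubling and halving give errors of equal size and opposite sign.
print(log_accuracy_ratio(100, 200))
print(log_accuracy_ratio(100, 50))
```

Unlike SMAPE, the sign of the result shows the direction of the error: positive for an overestimate and negative for an underestimate.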

==See also==
* Relative change and difference
* Mean absolute error
* Mean absolute percentage error
* Mean squared error
* Root mean squared error

{{No footnotes|date=August 2011}}

==References==
* Armstrong, J. S. (1985) Long-range Forecasting: From Crystal Ball to Computer, 2nd ed. Wiley. {{ISBN|978-0-471-82260-8}}
* Flores, B. E. (1986) "A pragmatic view of accuracy measurement in forecasting", Omega (Oxford), 14(2), 93&ndash;98. {{doi|10.1016/0305-0483(86)90013-7}}
* Tofallis, C. (2015) "A Better Measure of Relative Prediction Accuracy for Model Selection and Model Estimation", Journal of the Operational Research Society, 66(8), 1352&ndash;1362. [https://ssrn.com/abstract=2635088 archived preprint]

==External links==
* [http://robjhyndman.com/hyndsight/smape/ Rob J. Hyndman: Errors on Percentage Errors]

{{Machine learning evaluation metrics}}

[[Category:Statistical deviation and dispersion]]