Hi, I have a time series with its prediction (attached). I calculate the SMAPE with:
cas session;
libname D cas caslib="CASUSER";
proc tsmodel data=D.TESTSMAPE
             outobj=(utlstat=D.OUTSTATS(replace=YES));
   id DATE interval=day;
   var MEASURED SARIMA;
   require utl;
   submit;
      declare object utlstat(utlstat);
      rc = utlstat.Collect(MEASURED, SARIMA);
   endsubmit;
run;
In D.OUTSTATS I get SMAPE = 719.77. That is very weird, since the SMAPE metric is bounded to the range [0%, 200%].
I tried the same calculation by hand using the standard SMAPE formula
100%/N * SUM [ |PREDICTED-MEASURED| / ((|MEASURED|+|PREDICTED|)/2) ]
and I get SMAPE = 119.43, which seems correct, as shown in the attached Excel file.
So what is SAS doing? Where does that 719.77 value come from?
Thank you
Regards
Since you have opened Technical Support track regarding this question, I am posting below a summary of the response you already received from Technical Support just to close out this thread.
The TSMODEL procedure calculates the absolute symmetric percent error (ASPE), and its mean, as originally proposed by Makridakis and Armstrong. From the Statistics of Fit section of the documentation, the formula for ASPE is:
| 100 * (SARIMA - MEASURED) / (0.5 * (SARIMA + MEASURED)) |
Rob Hyndman wrote an interesting article on this topic, "Errors on percentage errors", in which he credits the Wikipedia formula to Chen and Yang (2004).
While we don't have a copy of the Chen and Yang paper on hand to verify, R&D is considering adding the statistic to those provided by TSMODEL in the future.
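To make the difference between the two formulas concrete, here is a minimal Python sketch. It is not SAS code and not the TSMODEL implementation; it only transcribes the two formulas quoted in this thread. The function names and the sample values are made up for illustration. The point it shows: with absolute values in the denominator each term is capped at 2 (200%), whereas with the signed sum in the denominator a term can become arbitrarily large when MEASURED and the prediction have opposite signs or nearly cancel. That is one plausible way to end up with a mean above 200, assuming the series contains such points.

```python
def smape_abs_denominator(actual, predicted):
    """Mean of |p - a| / ((|a| + |p|) / 2), in percent (the formula from the question).

    Each term is at most 2, so the result cannot exceed 200.
    Raises ZeroDivisionError if a pair is (0, 0).
    """
    terms = [
        abs(p - a) / ((abs(a) + abs(p)) / 2)
        for a, p in zip(actual, predicted)
    ]
    return 100 * sum(terms) / len(terms)


def smape_signed_denominator(actual, predicted):
    """Mean of |(p - a) / (0.5 * (p + a))|, in percent (the ASPE formula quoted above).

    The denominator is the signed sum, so a term is unbounded when p + a is near zero.
    Raises ZeroDivisionError if p + a == 0 for some pair.
    """
    terms = [
        abs((p - a) / (0.5 * (p + a)))
        for a, p in zip(actual, predicted)
    ]
    return 100 * sum(terms) / len(terms)


# Hypothetical values, NOT the data from the attachment.
# The last pair has opposite signs, so its signed sum is small (0.5).
actual = [10.0, 2.0, -1.0]
predicted = [12.0, 1.0, 1.5]

bounded = smape_abs_denominator(actual, predicted)
unbounded = smape_signed_denominator(actual, predicted)
print(f"abs denominator:    {bounded:.2f}")
print(f"signed denominator: {unbounded:.2f}")
```

For series that are strictly positive in both columns the two functions return the same value, so a large gap between them points to sign changes or near-zero sums in the data.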
Hello,
If you don't get a satisfactory answer to this very quickly, I would contact Technical Support in your local SAS office.
If what you report really is a problem in the software, it will be routed quickly to R&D and fixed.
Whatever you are doing wrong (or not), a value of > 700 should never be reported if the symmetric mean absolute percentage error (SMAPE or sMAPE) is bounded between 0% and 200%.
Do you have missing values in the series at the beginning of the data set? I remember a problem with SMAPE and missing values at the beginning of the series, but it was fixed in 2005!
Thanks,
Koen
Thank you. As you suggested, I opened a ticket with Technical Support.
I also thought the problem might be related to missing values, but after removing all missing values the result is still 719, so I guess the missing values are handled correctly and the problem lies elsewhere.
I'll share any useful news as soon as I have some.
Regards