Hi, I have a time series with its prediction (attached). I calculate the SMAPE with:
cas session;
libname D cas caslib="CASUSER";

proc tsmodel
    data=D.TESTSMAPE
    outobj=(utlstat=D.OUTSTATS(replace=YES));
  id DATE interval=day;
  var MEASURED SARIMA;
  require utl;
  submit;
    declare object utlstat(utlstat);
    rc = utlstat.Collect(MEASURED, SARIMA);
  endsubmit;
run;
In D.OUTSTATS I get SMAPE = 719.77. That is very weird, since the SMAPE metric is bounded to the interval [0%, 200%].
I tried the same calculation by hand using the standard SMAPE formula
100%/N * SUM [ |PREDICTED-MEASURED| / ((|MEASURED|+|PREDICTED|)/2) ]
and I get SMAPE = 119.43, which seems correct, as shown in the attached Excel file.
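For readers without the Excel file, the hand calculation can be sketched in a few lines of Python. This is only an illustration of the formula quoted above: the function name `smape_abs` and the choice to skip terms where both values are zero are assumptions, not something taken from the attachment or from SAS.

```python
def smape_abs(measured, predicted):
    """Mean of 100 * |p - m| / ((|m| + |p|) / 2) over all pairs.

    Each term lies in [0, 200], so the mean does too.
    """
    terms = []
    for m, p in zip(measured, predicted):
        denom = (abs(m) + abs(p)) / 2
        if denom == 0:
            # Both values are zero: the term is undefined.
            # Skipping it is an assumption made for this sketch.
            continue
        terms.append(100 * abs(p - m) / denom)
    if not terms:
        raise ValueError("no usable observations")
    return sum(terms) / len(terms)


# Made-up numbers, not the attached series.
print(smape_abs([1.0, 1.0], [1.0, 3.0]))  # 50.0
```

Because both absolute values sit in the denominator, even a forecast with the opposite sign of the measurement contributes at most 200 to the mean.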
So what is SAS doing? Where does that 719.77 value come from?
Thank you
Regards
Since you have opened a Technical Support track regarding this question, I am posting below a summary of the response you already received from Technical Support, just to close out this thread.
The TSMODEL procedure calculates the absolute symmetric percent error (ASPE), and its mean, as originally proposed by Makridakis and Armstrong. From the Statistics of Fit section of the documentation, the formula for ASPE is:
| 100 * (SARIMA - MEASURED) / (0.5 * (SARIMA + MEASURED)) |
Rob Hyndman wrote an interesting article on this topic, "Errors on percentage errors" (Rob J Hyndman), in which he credits the Wikipedia formula to Chen and Yang (2004).
While we don't have a copy of the Chen and Yang paper on hand to verify, R&D is considering adding the statistic to those provided by TSMODEL in the future.
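The practical difference between the two formulas is the denominator: ASPE divides by the signed average 0.5 * (SARIMA + MEASURED), while the Wikipedia-style formula divides by the average of the absolute values. When the forecast and the measurement have opposite signs, the signed average can be close to zero, so individual terms are no longer capped at 200. Whether that is what produced 719.77 for this particular series cannot be confirmed from the thread. The sketch below uses made-up numbers, and the name `mean_aspe` and the handling of a zero denominator are assumptions, not the TSMODEL implementation.

```python
def mean_aspe(measured, predicted):
    """Mean of |100 * (p - m) / (0.5 * (p + m))| over all pairs.

    The denominator is the signed average, so a term is unbounded
    when p and m have opposite signs.
    """
    terms = []
    for m, p in zip(measured, predicted):
        denom = 0.5 * (p + m)
        if denom == 0:
            # Undefined term; skipping it is an assumption of this sketch.
            continue
        terms.append(abs(100 * (p - m) / denom))
    if not terms:
        raise ValueError("no usable observations")
    return sum(terms) / len(terms)


# Made-up values: the second forecast has the opposite sign of the measurement.
measured = [10.0, 1.0]
predicted = [12.0, -0.5]

# Terms are about 18.18 and exactly 600, so the mean is about 309.09.
print(mean_aspe(measured, predicted))
```

With the absolute-value denominator the second term would be 200 instead of 600, which keeps the mean inside [0, 200].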
Hello,
If you don't get a satisfactory answer on this very quickly, I would contact Technical Support in your local SAS office.
If what you report really is a problem in the software, it will be routed quickly to R&D and fixed.
Whatever you are doing wrong (or not), a value of > 700 should never be reported if the symmetric mean absolute percentage error (SMAPE or sMAPE) is bounded between 0% and 200%.
Do you have missing values in the series at the beginning of the data set? I remember a problem with SMAPE and missing values at the beginning of the series, but it was fixed in 2005!
Thanks,
Koen
Thank you. As you suggested, I opened a ticket with Technical Support.
I also thought the problem might be related to missing values, but after removing all missing values the result is still 719, so I guess missing values are handled correctly and the problem is not there.
I'll share any useful news as soon as I have some.
Regards