I've only just started digging into SAS's guts and I'm having a mountain of problems with the ARIMA procedure.
I'm trying to use the two years of data in the attached dataset to predict the next 6 months. I figured out a way to calculate my MAPE, but I'm getting numbers as low as 0.16, which, if I understand correctly, translates into an accuracy of 99.84%. That just does not seem right to me. My dataset is attached and my code follows.
proc sgplot data = Items;
series x = DATEPHYSICAL Y = QTY;
run;
quit;
Proc gplot data=Items;
plot QTY * DATEPHYSICAL; /* GPLOT syntax is vertical * horizontal, so the date goes on the x axis */
Run;
Quit;
Proc Transreg Data = Items;
Model BOXCOX (QTY) = Identity(Datephysical);
Run;
Data ItemsLOG;
Set Items;
Log_QTY = log(QTY);
Run;
Proc gplot data=ItemsLOG;
plot Log_Qty * Datephysical; /* vertical * horizontal: date on the x axis */
Run;
Quit;
PROC ARIMA DATA= ItemsLOG;
IDENTIFY VAR = Log_QTY STATIONARITY= (ADF) ;
RUN;
QUIT;
PROC ARIMA DATA=ItemsLOG;
IDENTIFY VAR = Log_QTY (1,6) STATIONARITY= (ADF) ;
RUN;
quit;
Data Training Validation;
Set ItemsLOG;
If DATEPHYSICAL >= '31MAY2011'd then output Validation;
Else output Training;
Run;
PROC ARIMA DATA= Training ;
IDENTIFY VAR = Log_QTY(1,6) ;
ESTIMATE P =1 Q =1 OUTSTAT= stats ;
Forecast lead=6 interval = month id = DATEPHYSICAL
out = result;
RUN;
Quit;
/*RECOMMENDED TO THE ONE JUST ABOVE*/
PROC ARIMA DATA= Training;
IDENTIFY VAR = Log_QTY(1,6) MINIC;
RUN;
Quit;
%Macro top_models;
%do p = 0 %to 2 ;
%do q = 0 %to 2 ;
PROC ARIMA DATA= Training ;
IDENTIFY VAR = Log_QTY(1,6) ;
ESTIMATE P = &p. Q =&q. OUTSTAT= stats_&p._&q. ;
Forecast lead=6 interval = month id = DATEPHYSICAL
out = result_&p._&q.;
RUN;
Quit;
data stats_&p._&q.;
set stats_&p._&q.;
p = &p.;
q = &q.;
Run;
data result_&p._&q.;
set result_&p._&q.;
p = &p.;
q = &q.;
Run;
%end;
%end;
Data final_stats ;
set %do p = 0 %to 2 ; /* same bounds as the fitting loops, so no model is dropped */
%do q = 0 %to 2 ;
stats_&p._&q.
%end;
%end;;
Run;
Data final_results ;
set %do p = 0 %to 2 ; /* same bounds as the fitting loops, so no model is dropped */
%do q = 0 %to 2 ;
result_&p._&q.
%end;
%end;;
Run;
%Mend;
%top_models
/* Then to calculate the mean of AIC and SBC*/
proc sql;
create table final_stats_1 as select p,q, sum(_VALUE_)/2 as mean_aic_sbc from final_stats
where _STAT_ in ('AIC','SBC')
group by p,q
order by mean_aic_sbc;
quit;
Proc SQL;
create table final_results_1 as select a.p, a.q, a.DATEPHYSICAL,a.forecast, b.log_qty
from final_results as a join validation as b
on a.DATEPHYSICAL = b.DATEPHYSICAL;
quit;
Data Mape;
set final_results_1 ;
Ind_Mape = abs(log_qty - forecast)/ log_qty;
Run;
Proc Sql;
create table mape as select p, q, mean(ind_mape) as mape from mape
group by p, q
order by mape ;
quit;
Am I measuring my MAPE incorrectly or have I screwed up royally somewhere much earlier in my ARIMA code?
Hi Demios,
A MAPE of 0.16 means a 16% error, not a 0.16% error, so the accuracy should be around 84%.
thanks
Alex
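To see the values as percentages directly, here is a small sketch, assuming the `mape` table produced by the code in the question. The PERCENTw.d format multiplies by 100 for display only; the stored values stay as fractions.

```sas
/* Print the fractional MAPE values as percentages.            */
/* Assumes the mape table (columns p, q, mape) from the question. */
proc print data=mape noobs;
format mape percent8.2; /* a stored value of 0.16 is displayed as 16.00% */
run;
```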
Gosh, I feel like a dolt for that. The accuracy is still MUCH higher than I would expect at 84%, but that is definitely less heart attack inducing than getting a 99% accuracy. Thanks a ton.
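One thing that may explain why the figure still looks optimistic: `Ind_Mape` in the code above is computed on the logged values (`log_qty` and the forecast of `Log_QTY`), so it measures percentage error in log units rather than in units of QTY. Below is a minimal sketch of the same calculation on the original scale. It assumes the `final_results` and `validation` tables exist as built in the question; the names `final_results_orig`, `forecast_qty`, `mape_orig` and `mape_pct` are made up for illustration. Note that `exp(forecast)` is a naive back-transform (closer to a median than a mean forecast), so treat the result as a rough check rather than a definitive number.

```sas
/* Sketch: MAPE on the original QTY scale instead of the log scale. */
/* Assumes final_results and validation from the question exist.    */
proc sql;
/* Back-transform the log-scale forecast and pair it with the actual QTY */
create table final_results_orig as
select a.p, a.q, a.DATEPHYSICAL,
       exp(a.forecast) as forecast_qty, /* naive back-transform, no bias correction */
       b.QTY
from final_results as a
     inner join validation as b
     on a.DATEPHYSICAL = b.DATEPHYSICAL;

/* Mean absolute percentage error per (p, q), as a fraction and as a percent */
create table mape_orig as
select p, q,
       mean(abs(QTY - forecast_qty) / QTY) as mape,
       100 * mean(abs(QTY - forecast_qty) / QTY) as mape_pct
from final_results_orig
group by p, q
order by mape;
quit;
```

If the numbers from `mape_orig` come out noticeably larger than the log-scale ones, that would suggest the log transform was flattering the original figure.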