I've only just started digging into SAS's guts and I'm having a mountain of problems with the ARIMA procedure.
I'm trying to use the two years in the attached dataset to predict the next 6 months. I figured out a way to calculate my MAPE, but I'm getting numbers as low as 0.16, which, if I understand correctly, translates into an accuracy of 99.84%. That just does not seem right to me. I've attached my dataset, and my code is as follows.
/* Plot the raw series */
proc sgplot data=Items;
    series x=DATEPHYSICAL y=QTY;
run;
quit;

proc gplot data=Items;
    plot DATEPHYSICAL * QTY;
run;
quit;

/* Box-Cox check, then log-transform the series */
proc transreg data=Items;
    model boxcox(QTY) = identity(DATEPHYSICAL);
run;

data ItemsLOG;
    set Items;
    Log_QTY = log(QTY);
run;

proc gplot data=ItemsLOG;
    plot DATEPHYSICAL * Log_QTY;
run;
quit;

/* Stationarity checks: undifferenced, then differenced at lags 1 and 6 */
proc arima data=ItemsLOG;
    identify var=Log_QTY stationarity=(adf);
run;
quit;

proc arima data=ItemsLOG;
    identify var=Log_QTY(1,6) stationarity=(adf);
run;
quit;

/* Split into training and validation sets */
data Training Validation;
    set ItemsLOG;
    if DATEPHYSICAL >= '31MAY2011'd then output Validation;
    else output Training;
run;

proc arima data=Training;
    identify var=Log_QTY(1,6);
    estimate p=1 q=1 outstat=stats;
    forecast lead=6 interval=month id=DATEPHYSICAL out=result;
run;
quit;

/* RECOMMENDED TO THE ONE JUST ABOVE */
proc arima data=Training;
    identify var=Log_QTY(1,6) minic;
run;
quit;

/* Fit every p, q combination from 0 to 2 and keep the stats and forecasts */
%macro top_models;
    %do p = 0 %to 2;
        %do q = 0 %to 2;
            proc arima data=Training;
                identify var=Log_QTY(1,6);
                estimate p=&p. q=&q. outstat=stats_&p._&q.;
                forecast lead=6 interval=month id=DATEPHYSICAL out=result_&p._&q.;
            run;
            quit;

            data stats_&p._&q.;
                set stats_&p._&q.;
                p = &p.;
                q = &q.;
            run;

            data result_&p._&q.;
                set result_&p._&q.;
                p = &p.;
                q = &q.;
            run;
        %end;
    %end;

    /* Stack the outputs; these loops only cover p, q from 0 to 1 */
    data final_stats;
        set
        %do p = 0 %to 1;
            %do q = 0 %to 1;
                stats_&p._&q.
            %end;
        %end;
        ;
    run;

    data final_results;
        set
        %do p = 0 %to 1;
            %do q = 0 %to 1;
                result_&p._&q.
            %end;
        %end;
        ;
    run;
%mend;

%top_models

/* Then to calculate the mean of AIC and SBC */
proc sql;
    create table final_stats_1 as
    select p, q, sum(_VALUE_)/2 as mean_aic_sbc
    from final_stats
    where _STAT_ in ('AIC','SBC')
    group by p, q
    order by mean_aic_sbc;
quit;

/* Join the forecasts to the validation actuals (both on the log scale) */
proc sql;
    create table final_results_1 as
    select a.p, a.q, a.DATEPHYSICAL, a.forecast, b.log_qty
    from final_results as a
    join validation as b
        on a.DATEPHYSICAL = b.DATEPHYSICAL;
quit;

/* Absolute percentage error per month, computed on the log scale */
data Mape;
    set final_results_1;
    Ind_Mape = abs(log_qty - forecast) / log_qty;
run;

proc sql;
    create table mape as
    select p, q, mean(ind_mape) as mape
    from mape
    group by p, q
    order by mape;
quit;
Am I measuring my MAPE incorrectly or have I screwed up royally somewhere much earlier in my ARIMA code?
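For comparison, here is a minimal sketch of the same error calculation on the original QTY scale instead of the log scale. It assumes the final_results_1 table built by the code above, and it back-transforms with a plain exp(), which is a naive choice that ignores any lognormal bias adjustment. The names mape_orig, mape_orig_summary, qty_actual, qty_forecast, ind_ape and mape_pct are made up for illustration. Note also that Ind_Mape as computed above is a fraction rather than a percentage, so a value of 0.16 would read as 16%, not 0.16%.

```sas
/* Sketch only: MAPE on the original QTY scale.
   Assumes final_results_1 holds forecast and log_qty on the log scale. */
data mape_orig;
    set final_results_1;
    qty_actual   = exp(log_qty);    /* undo the log transform on the actuals */
    qty_forecast = exp(forecast);   /* naive back-transform, no bias adjustment */
    ind_ape = abs(qty_actual - qty_forecast) / qty_actual;
run;

/* Average per model and express as a percentage */
proc sql;
    create table mape_orig_summary as
    select p, q, mean(ind_ape) * 100 as mape_pct
    from mape_orig
    group by p, q
    order by mape_pct;
quit;
```

Errors measured on the log scale tend to look much smaller than errors on the original scale, because the logs of the quantities sit in a narrow range, so the two tables may rank the models similarly while showing very different magnitudes.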