FVidal
Fluorite | Level 6

Hello everybody,

 

I would like to know your opinion about an ARIMA model automatically created by SAS Forecast Studio.

The model has only the d argument specified; p and q are empty, like this:

p= empty

d= (1,12)

q= empty

 

Which model do we have? I understand that d=1 means first-order differencing and d=2 means second-order differencing. But what does (1,12) mean?

And if we are only differencing, without p and q, which model is it?

In fact, the model I get is a good one and it reflects what I want, but I don't understand how it works when I look at the parameters.

Can you shed some light on this?

 

I'm using SAS Forecast Studio 12.1.

 

Thank you,

Ferran

alexchien
Pyrite | Level 9

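If the time interval is monthly, differencing at lag 12 is seasonal differencing. So the model with d=(1, 12) is essentially a hybrid of the random walk and seasonal random walk models: it applies a first difference and a lag-12 seasonal difference, with no AR or MA terms.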

FVidal
Fluorite | Level 6

Thank you alexchien,

 

I'm trying to reproduce this model in Excel. Do you know which formula SAS uses for the 'random walk' model?

 

Regards,

Ferran

alexchien
Pyrite | Level 9

The differencing operation for dif(1, 12) looks like the following, where B is the backshift operator (B Y_t = Y_{t-1}):

(1 - B)(1 - B^12) Y_t = Y_t - Y_{t-1} - Y_{t-12} + Y_{t-13}

The random walk model assumes that the expected value of the differenced series is 0 (or a constant, if you want to include a "drift" in the model). Setting the expression above to zero at time t+1 and solving for Y_{t+1} gives the forecast formula:

Y_{t+1} = Y_t + Y_{t-11} - Y_{t-12}

You can run the following code to check the math, and an Excel book is attached with the same data and forecasts.

 

data a;
input time_id Y;
cards;
1 104.1675397
2 100.5498722
3 102.6136901
4 103.5533417
5 101.7734511
6 102.8178815
7 104.6211812
8 102.5835074
9 102.4308517
10 101.3071798
11 101.9181746
12 100.5437124
13 101.3355998
14 104.2619803
15 103.777189
16 100.0892918
17 102.4089642
18 103.0160954
19 102.183431
20 100.4450315
21 101.9886783
22 100.4891325
23 100.7548746
24 102.4628036
;;
run;

proc arima data=a;
   identify var=y(1 12);   /* difference at lags 1 and 12 */
   estimate noint;         /* no AR or MA terms, and no drift */
   forecast lead=12 interval=obs id=time_id out=results;
run;
quit;
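
As a rough cross-check, the forecast formula above can also be applied by hand in a DATA step. This is only a sketch: it assumes the data set a created above, and the names hand, yy, and forecast_by_hand are made up for illustration. Each forecast is fed back into the recursion to produce the next one.

```sas
/* Sketch: apply Y(t) = Y(t-1) + Y(t-12) - Y(t-13) by hand for periods 25-36. */
/* Assumes data set A from the post; HAND, yy, forecast_by_hand are           */
/* illustrative names, not part of the original example.                      */
data hand;
   array yy{36} _temporary_;   /* 24 actuals plus 12 forecast slots */

   /* Load the 24 observed values into the array */
   do i = 1 to 24;
      set a point=i;
      yy{i} = Y;
   end;

   /* Each forecast reuses earlier actuals and, later on, earlier forecasts */
   do time_id = 25 to 36;
      yy{time_id} = yy{time_id-1} + yy{time_id-12} - yy{time_id-13};
      forecast_by_hand = yy{time_id};
      output;
   end;

   stop;   /* needed because POINT= access never reaches end of file */
   keep time_id forecast_by_hand;
run;

proc print data=hand noobs;
run;
```

The printed values can be compared with the FORECAST variable for the 12 lead periods in the results data set written by PROC ARIMA.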

 

FVidal
Fluorite | Level 6

Alexchien,

 

Thank you very much for the example. It could not be clearer! 🙂

Now I'm able to understand and reproduce models with differencing and also with autoregressive parameters (p).

Could you help me with an example using the 'moving-average' (q) parameter? I always find this part of an ARIMA model confusing. Having an Excel example always helps to clarify it and to trust the models.

Thanks in advance,

Ferran

alexchien
Pyrite | Level 9

The MA model has the following general form:

Y_t = C + (1 - m_1 B - m_2 B^2 - ...) e_t

If you fit an MA(1) model to the sample data provided, you will get parameter estimates of 102.17485004 for the mean (C) and -0.041325847 for m_1. Thus the forecast formula is

Y_t = 102.17485004 + 0.041325847 e_{t-1}

So for the first historical period, the fitted value is simply the mean, 102.17485004. To generate the fitted value for the 2nd period, you first compute e_1 = Y_1 - Y_1 hat = (104.1675397 - 102.17485004) = 1.99269. Based on the formula above, the fitted value for the 2nd period is 102.17485004 + 0.041325847 e_1 = 102.25719963. Please note that the e_t in the forecast lead periods are 0.

If you difference Y, just plug the differenced Y into the formula above. Please note that C is usually suppressed if the Y series is differenced (i.e. estimate NOINT).

I added 2 tabs to the Excel book to demonstrate how to generate forecasts with MA(1) and MA(2) models.

 

 

data a;
input time_id Y;
cards;
1 104.1675397
2 100.5498722
3 102.6136901
4 103.5533417
5 101.7734511
6 102.8178815
7 104.6211812
8 102.5835074
9 102.4308517
10 101.3071798
11 101.9181746
12 100.5437124
13 101.3355998
14 104.2619803
15 103.777189
16 100.0892918
17 102.4089642
18 103.0160954
19 102.183431
20 100.4450315
21 101.9886783
22 100.4891325
23 100.7548746
24 102.4628036
;;
run;

proc arima data=a;
   identify var=y;
   estimate q=1 outest=paramEst1;   /* MA(1); estimates saved to paramEst1 */
   forecast lead=12 interval=obs id=time_id out=results1;
run;

   estimate q=2 outest=paramEst2;   /* MA(2), fitted in the same PROC ARIMA session */
   forecast lead=12 interval=obs id=time_id out=results2;
run;
quit;
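
The MA(1) recursion described above can be sketched in a DATA step as well. This is a rough illustration, not the procedure's internal computation: it assumes the data set a from the post, hard-codes the two parameter estimates quoted above, and starts from a previous error of 0 (which is what makes the first fitted value equal to the mean). The names ma1_hand, fitted, and e_prev are invented for the example.

```sas
/* Sketch: MA(1) fitted values (periods 1-24) and forecasts (25-36) by hand. */
/* Assumes data set A from the post; c and m1 are the estimates quoted in    */
/* the post; MA1_HAND, fitted, and e_prev are illustrative names.            */
data ma1_hand;
   c      = 102.17485004;    /* estimated mean (C) */
   m1     = -0.041325847;    /* estimated MA(1) parameter (m_1) */
   e_prev = 0;               /* assumed starting error before period 1 */

   do t = 1 to 36;
      if t <= 24 then set a;          /* read the next actual value */
      else do;                        /* lead periods: no actual observed */
         time_id = t;
         Y = .;
      end;

      fitted = c - m1 * e_prev;       /* Y_t hat = C - m_1 * e_(t-1) */

      /* Error carried into the next period: actual minus fitted, or 0 */
      if missing(Y) then e_prev = 0;
      else e_prev = Y - fitted;

      output;
   end;

   stop;   /* all 36 rows are produced in a single pass of the step */
   keep time_id Y fitted;
run;
```

The fitted column can be compared with the FORECAST variable in results1. From period 26 onward the value is simply c, because the previous error is 0.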

 

FVidal
Fluorite | Level 6

Hello alexchien,

 

Thanks again for your answers! I just have one question. You said the general formula for MA is:

Y_t = C + (1 - m_1 B - m_2 B^2 - ...) e_t

so the formula for MA(1) should be, if I'm not wrong:

Y_t = C + e_t - m_1 e_{t-1}

But your formula Y_t = 102.17485004 + 0.041325847 e_{t-1} does not include e_t.

Can we assume that e_t is equal to 0? Or are we assuming that C + e_t = mean?

 

Thanks!

Ferran

alexchien
Pyrite | Level 9

Hi Ferran,

You are correct that e_t is set to 0 when forecasting Y_t, since 0 is the expected value of the e's. e_{t-1} won't be 0, as it is the observed forecast error from the previous period (Actual_{t-1} - Y_{t-1} hat). The e's in the forecast lead periods (in the future) will be 0, since no actuals are observed. You can see this in the MA examples in the Excel book, where the errors are set to 0 after period 24.
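
Written out for the MA(1) example in this thread, the rule looks roughly like this. The sketch assumes 24 observed periods and a starting error e_0 = 0, which is consistent with the first fitted value being the mean; the hat marks a fitted or forecast value.

```latex
\begin{align*}
\hat{Y}_{t} &= C - m_1 e_{t-1}, \qquad e_t = Y_t - \hat{Y}_t, && t = 1, \dots, 24 \quad (e_0 = 0) \\
\hat{Y}_{25} &= C - m_1 e_{24} \\
\hat{Y}_{25+k} &= C, && k = 1, 2, \dots
\end{align*}
```

So only the first lead period still depends on an observed error; every later forecast from an MA(1) model falls back to the mean C.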

thanks

Alex
