Hi!
I am trying to use Interrupted time series with ARIMA model to compare before and after at intervention=45
data outcome;
input outcome time intervention time_af_int;
datalines;
9 1 0 0
9 2 0 0
10 3 0 0
8 4 0 0
8 5 0 0
6 6 0 0
6 7 0 0
13 8 0 0
20 9 0 0
23 10 0 0
29 11 0 0
34 12 0 0
19 13 0 0
39 14 0 0
44 15 0 0
29 16 0 0
34 17 0 0
62 18 0 0
50 19 0 0
46 20 0 0
51 21 0 0
36 22 0 0
42 23 0 0
48 24 0 0
30 25 0 0
64 26 0 0
66 27 0 0
77 28 0 0
54 29 0 0
74 30 0 0
48 31 0 0
52 32 0 0
73 33 0 0
77 34 0 0
83 35 0 0
55 36 0 0
48 37 0 0
48 38 0 0
47 39 0 0
44 40 0 0
49 41 0 0
64 42 0 0
35 43 1 1
77 44 1 2
46 45 1 3
58 46 1 4
55 47 1 5
70 48 1 6
41 49 1 7
56 50 1 8
45 51 1 9
57 52 1 10
62 53 1 11
51 54 1 12
76 55 1 13
58 56 1 14
46 57 1 15
71 58 1 16
62 59 1 17
64 60 1 18
59 61 1 19
54 62 1 20
70 63 1 21
54 64 1 22
65 65 1 23
52 66 1 24
56 67 1 25
70 68 1 26
71 69 1 27
70 70 1 28
60 71 1 29
;
run;
there are the steps I went through
1 check for stationarity by using PROC ARIMA
proc arima data=sample;
identify var=outcome stationarity=(adf);
run;
the results showed the outcome is stationarity
2. from ACF PACF plots the AR=2
my questions are
1.how I can estimate the coefficient of the model (b0,b1,b2,b3)with AR=2 if I used the linear regression
outcome=b0+b1*time+b2*intervention+b3*time_af_int
2. can I use the nonlinear regression with ARIMA?
3. how account for the seasonality?
I played with your data a little more. I think your series may have a break around time=36 rather than 45 as you thought. Based on the initial model, PROC ARIMA suggests two likely change points 26 and 36. I did another analysis using PROC SSM (could have used PROC UCM also), which can explicitly extract useful patterns that can be plotted. That analysis suggested break at 36. Here is the code for analysis based on PROC SSM:
/* Initial model */
proc ssm data=outcome breakpeaks like=marginal;
id time interval=obs;
trend level(ll) variance=0 checkbreak; /* Integrated random walk trend */
comp slope = (0 1)*level_state_;
irregular wn;
model outcome =level wn;
output out=ssmfor press;
run;
proc sgplot data=ssmfor;
scatter x=time y=outcome;
series x=time y=smoothed_level;
run;
proc sgplot data=ssmfor;
series x=time y=smoothed_slope;
run;
/* Model with shift at 36 */
proc ssm data=outcome breakpeaks like=marginal;
id time interval=obs;
shift = (time>=36);
trend level(ll) variance=0 checkbreak;
comp slope = (0 1)*level_state_;
irregular wn;
evaluate pattern = shift + level;
model outcome = shift level wn;
output out=ssmfor press;
run;
proc sgplot data=ssmfor;
scatter x=time y=outcome;
series x=time y=smoothed_pattern;
run;
proc sgplot data=ssmfor;
series x=time y=smoothed_slope;
run;
The SSM procedure syntax is a little more involved but it might be worth taking a look. Also take a look at the SGF paper on change point analysis:
http://support.sas.com/resources/papers/proceedings17/SAS0456-2017.pdf
Intervention analysis is a large area in time series analysis. You will need to explore literature on this topic (Box and Jenkins' book on time series analysis is a good start. Also see the documentation of the OUTLIER statement of PROC ARIMA). It is difficult to suggest an intervention analysis in each individual case. A quick check of your data suggests the following type of analysis (only a suggestion):
*A quick analysis code to get you started----------------*;
/*--Time series plot of your data that seems to suggest an ARIMA(0,1,1) model */
proc sgplot data=outcome;
series x=time y=outcome;
run;
proc arima data=outcome;
/* initial test model: ARIMA(0,1,1) */
identify var=outcome(1) noprint;
estimate q=1 noint method=ml;
outlier;
run;
/* ------------
1. residual plots show that initial model
appears adequate.
2. outlier output does not seem to indicate an outliers
around time 45
--------------*/
*Test whether intervention variable, time_af_int, improves
the model;
identify var=outcome(1) crosscorr=(time_af_int(1)) noprint;
estimate q=1 input=time_af_int noint method=ml;
outlier;
run;
*The coefficient of time_af_int appears insgnificant;
quit;
I played with your data a little more. I think your series may have a break around time=36 rather than 45 as you thought. Based on the initial model, PROC ARIMA suggests two likely change points 26 and 36. I did another analysis using PROC SSM (could have used PROC UCM also), which can explicitly extract useful patterns that can be plotted. That analysis suggested break at 36. Here is the code for analysis based on PROC SSM:
/* Initial model */
proc ssm data=outcome breakpeaks like=marginal;
id time interval=obs;
trend level(ll) variance=0 checkbreak; /* Integrated random walk trend */
comp slope = (0 1)*level_state_;
irregular wn;
model outcome =level wn;
output out=ssmfor press;
run;
proc sgplot data=ssmfor;
scatter x=time y=outcome;
series x=time y=smoothed_level;
run;
proc sgplot data=ssmfor;
series x=time y=smoothed_slope;
run;
/* Model with shift at 36 */
proc ssm data=outcome breakpeaks like=marginal;
id time interval=obs;
shift = (time>=36);
trend level(ll) variance=0 checkbreak;
comp slope = (0 1)*level_state_;
irregular wn;
evaluate pattern = shift + level;
model outcome = shift level wn;
output out=ssmfor press;
run;
proc sgplot data=ssmfor;
scatter x=time y=outcome;
series x=time y=smoothed_pattern;
run;
proc sgplot data=ssmfor;
series x=time y=smoothed_slope;
run;
The SSM procedure syntax is a little more involved but it might be worth taking a look. Also take a look at the SGF paper on change point analysis:
http://support.sas.com/resources/papers/proceedings17/SAS0456-2017.pdf
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.
Find more tutorials on the SAS Users YouTube channel.