sasalex2024
Quartz | Level 8

Dear SAS Community,

I have a time series variable, 'Yt', and the corresponding time variable, 'Date'. My ultimate goal is to fit an ARIMA model to Yt, so I first need to determine the most appropriate transformation. The PROC TRANSREG documentation does not seem to include time series examples that use IDENTITY on the right-hand side of the MODEL statement, so I am not sure whether including ‘Date’ in IDENTITY is correct. Could you please confirm whether the following procedure is correctly specified for identifying the optimal Box-Cox transformation in a time series context? Or should I perhaps use a different procedure? Thank you.

proc Transreg Data = a;
Model BoxCox(Yt/lambda=-2 to 2 by 0.01 alpha=0.00001)=Identity(Date);
run;
quit;
6 REPLIES
WarrenKuhfeld
Ammonite | Level 13

I feel like I should say something more intelligent about this; after all, I wrote PROC TRANSREG and the BOXCOX transformation. But nothing I did there had anything to do with ARIMA or time series.

SASCom1
SAS Employee

You may take a look at the %BOXCOXAR macro, which is designed to find the optimal Box-Cox transformation for a time series:

SAS Help Center: BOXCOXAR Macro

I hope this helps.

sasalex2024
Quartz | Level 8

Thank you for the reference, SASCom1.

I greatly appreciate it. I tried the %BOXCOXAR macro on the sample sashelp.air data set. After reviewing the manual, my understanding is that I should use DIF=(1,12) for this data set, because the raw Air series exhibits both trend and seasonal patterns. However, I am not sure whether to keep the default for the AR= option or to specify it manually. The manual states, "For a process with moving-average terms, a large value for the AR= option might be appropriate," but it is unclear to me what counts as a "large" value. Should I choose AR=p for the raw Air data so that the residuals of the fitted model are white noise? I am uncertain how best to choose this crucial option, especially since the manual states that the transformed series must be a stationary AR(p) process with white noise innovations, and the "optimal" number of lags for the raw Air data may differ from the optimal number for the transformed series. This is the code I am using for now:

%BOXCOXAR(sashelp.air, Air, DIF=(1,12), AR=2, LAMBDALO=-2, LAMBDAHI=2, NLAMBDA=50, OUT=BoxCox);
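
For reference, a minimal sketch of how the result of a call like this can be inspected. It relies on the documented behavior that the macro returns the best lambda in the macro variable &BOXCOXAR and writes the per-lambda results to the OUT= data set. The choice NLAMBDA=41 is an assumption made only for illustration: if the grid is evenly spaced and includes both endpoints, 41 points over [-2, 2] give steps of 0.1, so that lambda = 0 (the log transformation) is on the grid, whereas 50 points give steps of 4/49.

```sas
/* Same search as above, but with a grid that (assuming even spacing
   with both endpoints included) steps by 0.1 and contains lambda = 0 */
%boxcoxar(sashelp.air, air, dif=(1,12), ar=2,
          lambdalo=-2, lambdahi=2, nlambda=41, out=BoxCox);

/* The macro returns the selected lambda in the macro variable
   &BOXCOXAR (it holds the text ERROR if the search failed) */
%put NOTE: lambda selected by BOXCOXAR = &boxcoxar;

/* List the fit statistics recorded for every lambda tried */
proc print data=BoxCox;
run;
```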

SASCom1
SAS Employee

Hi @sasalex2024 ,

A large value for the AR= option is recommended for a process with moving-average terms because an invertible MA(q) process can be expressed as an AR process of infinite order. I don't know of a rule of thumb for how large is large enough, but you can base the decision on things like the length of the series and the significance of the AR parameters.
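
To illustrate with the simplest case, here is a worked sketch for an MA(1) process with coefficient θ (the notation is introduced here for illustration and is not taken from the macro documentation). Inverting the MA polynomial, which requires |θ| < 1, gives the infinite-order AR form:

```latex
x_t = \varepsilon_t + \theta\,\varepsilon_{t-1} = (1 + \theta B)\,\varepsilon_t,
\qquad |\theta| < 1
\\[4pt]
\varepsilon_t = (1 + \theta B)^{-1} x_t = \sum_{j=0}^{\infty} (-\theta)^j x_{t-j}
\\[4pt]
x_t = -\sum_{j=1}^{\infty} (-\theta)^j x_{t-j} + \varepsilon_t
    = \theta x_{t-1} - \theta^2 x_{t-2} + \theta^3 x_{t-3} - \cdots + \varepsilon_t
```

The coefficient at lag j has magnitude |θ|^j, so how quickly the AR terms become negligible depends on θ. With θ = 0.5 the lag-5 coefficient is already about 0.03, while with θ = 0.9 it is still about 0.59 and falls to roughly 0.12 only at lag 20. This is one way to read "large": the closer the MA part is to non-invertibility, the more AR lags are needed to approximate it.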

I hope this helps.

sasalex2024
Quartz | Level 8

Thank you for your reply, SASCom1. I don't know how to view the estimated AR coefficients after running %BOXCOXAR. Is there an option I can use to access these values?

One other thing I am trying to understand is whether I need to check the appropriateness of the AR terms after obtaining an optimal lambda estimate from %BOXCOXAR. If I understood you correctly, you are suggesting the following approach:

  1. Pick a reasonable value for AR = n (say, AR = 5).
  2. Run BOXCOXAR to obtain the optimal value of lambda where AR=5.
  3. Manually construct the transformed series Yt from the original Xt series using the optimal lambda computed in step 2.
  4. Fit an AR(5) model to Yt using, for example, PROC ARIMA.
  5. Examine the significance of the AR terms in PROC ARIMA.
  6. Repeat this process for other values of AR.

Is that correct? If so, this seems like a rather tedious process, and I am unsure whether it can be simplified, because the manual explicitly states that the AR model in %BOXCOXAR is fitted to the transformed series, Yt, not to the original series, Xt. Thank you.
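
The steps listed above could be sketched roughly as follows for sashelp.air. This is an illustration under stated assumptions, not a recommended workflow: the data set name air_bc, the variables lambda and y, AR=5, NLAMBDA=41, NLAG=24 and METHOD=ML are all choices made here, and the transformation is written in the standard Box-Cox form, which should be checked against the macro documentation before relying on it.

```sas
/* Steps 1-2: search for lambda with AR=5; the selected value is
   returned in the macro variable &BOXCOXAR */
%boxcoxar(sashelp.air, air, dif=(1,12), ar=5,
          lambdalo=-2, lambdahi=2, nlambda=41);
%put NOTE: lambda selected with AR=5 is &boxcoxar;

/* Step 3: build the transformed series y from the original series.
   Standard Box-Cox form is assumed: log for lambda = 0,
   (x**lambda - 1)/lambda otherwise */
data air_bc;                       /* hypothetical data set name */
   set sashelp.air;
   lambda = &boxcoxar;
   if abs(lambda) < 1e-8 then y = log(air);
   else y = (air**lambda - 1) / lambda;
run;

/* Steps 4-5: fit AR(5) to the transformed series, using the same
   differencing as in the macro call, and inspect the t values of
   the AR parameter estimates */
proc arima data=air_bc;
   identify var=y(1,12) nlag=24;
   estimate p=5 method=ml;         /* ML is a choice made for this sketch */
run;
quit;
```

Step 6 then amounts to changing the two occurrences of 5 and rerunning. The residual autocorrelation check that PROC ARIMA prints with the ESTIMATE results gives a quick indication of whether the chosen order leaves the residuals close to white noise.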

SASCom1
SAS Employee

The %BOXCOXAR macro does not output AR parameter estimates.

I do not know whether there is a specific process that is considered 'the correct' approach. When I posted my previous response, I honestly was not thinking of the process you outlined :-) I only meant to suggest some things you might consider when deciding how large a value to specify for the AR= option in %BOXCOXAR for a process with MA terms. My thought was simply this: if you fit an AR model to the differenced but untransformed series, the significance of the AR parameters gives you some indication, though possibly not an accurate one, of which lags are worth including when you specify the AR= option. You can also try several different AR orders and see whether and how the %BOXCOXAR results change.
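
A rough sketch of both suggestions for sashelp.air, under stated assumptions: the macro name try_ar, its orders= parameter, the list of orders, p=12 and NLAMBDA=41 are hypothetical choices made only for this illustration.

```sas
/* Rough guide only: AR fit on the differenced, untransformed series.
   p=12 is an arbitrary starting order chosen for this sketch */
proc arima data=sashelp.air;
   identify var=air(1,12) nlag=24;
   estimate p=12 method=ml;
run;
quit;

/* Rerun the lambda search for several AR orders and write the
   selected lambda for each order to the log */
%macro try_ar(orders=2 5 8 12);    /* hypothetical helper macro */
   %local _k _p;
   %let _k = 1;
   %let _p = %scan(&orders, &_k);
   %do %while(%length(&_p) > 0);
      %boxcoxar(sashelp.air, air, dif=(1,12), ar=&_p,
                lambdalo=-2, lambdahi=2, nlambda=41);
      %put NOTE: AR=&_p gives lambda=&boxcoxar;
      %let _k = %eval(&_k + 1);
      %let _p = %scan(&orders, &_k);
   %end;
%mend try_ar;

%try_ar()
```

If the selected lambda barely moves across the orders tried, the exact AR= value is probably not critical for this series; if it moves a lot, the longer orders deserve a closer look.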