New Contributor
Posts: 3

# PROC ARIMA Variance Estimate Question

I'd appreciate it if anyone could explain how SAS's PROC ARIMA calculates the Variance Estimate of the residual series. I have not been able to replicate it.

From the SAS documentation:

The "Variance Estimate" is the variance of the residual series

SAS/ETS(R) 9.2 User's Guide

The results of my ARIMA fit produce the following:

| Statistic | Value |
|---|---|
| Constant Estimate | 0.0238 |
| Variance Estimate | 0.00014 |
| Std Error Estimate | 0.011849 |
| AIC | -233.52 |
| SBC | -218.32 |
| Number of Residuals | 40 |

I would like to replicate the 0.00014 Variance Estimate. To do so, I took the 40 residuals from my ARIMA fit and ran a variance calculation on them using Query Builder, and the result does not match. I also exported the data to Excel, and neither the population nor the sample variance produces the same result. Can someone please explain what is happening with SAS's variance estimate?

Thank you.  (Residuals have been listed below)

0.0131739783

-0.001648352

0.0055679254

0.0033041448

-0.027348282

-0.018087132

-0.002878327

-0.004774205

0.0074996639

-0.012934558

-0.003643018

0.0033317717

0.0143422014

-0.002557585

-0.005269819

0.0097004778

0.0143494424

-0.022979957

0.0111023891

-0.008277071

-0.023881134

0.0024810503

-0.002777949

-0.007590526

0.0033606953

0.0099064146

-0.001947414

0.0028343645

0.0027540412

-0.003686471

-0.001714685

-0.014344516

0.0114885326

-0.001812683

0.0092582282

0.0058740413

0.0033302227

0.0009109483

0.0153236155

0.005019864

SAS Super FREQ
Posts: 4,240

## Re: PROC ARIMA Variance Estimate Question

My guess is that these residuals were copied from some ODS output. The mean of these 40 numbers is -0.000331, whereas true residuals would sum to zero (or numerically, to a very small number such as 1e-16).

Create an output data set that contains the residuals and then use

PROC MEANS N var;

run;

I suspect (hope) that you will get a variance that agrees with PROC ARIMA.

New Contributor
Posts: 3

## Re: PROC ARIMA Variance Estimate Question

Hi Rick,

Thank you for your reply. I used Query Builder to attempt the VAR and MEAN calculations. Working directly from the actual ARIMA output, the mean is -0.000330992, and the variance still does not match the value reported by PROC ARIMA.
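The check described here can be reproduced outside SAS. The sketch below is written in Python rather than SAS, purely to check the arithmetic. It takes the 40 residuals listed in the first post and computes their mean and both ordinary variance estimates. The variable names are illustrative only.

```python
import statistics

# The 40 residuals listed in the first post of this thread.
residuals = [
    0.0131739783, -0.001648352, 0.0055679254, 0.0033041448,
    -0.027348282, -0.018087132, -0.002878327, -0.004774205,
    0.0074996639, -0.012934558, -0.003643018, 0.0033317717,
    0.0143422014, -0.002557585, -0.005269819, 0.0097004778,
    0.0143494424, -0.022979957, 0.0111023891, -0.008277071,
    -0.023881134, 0.0024810503, -0.002777949, -0.007590526,
    0.0033606953, 0.0099064146, -0.001947414, 0.0028343645,
    0.0027540412, -0.003686471, -0.001714685, -0.014344516,
    0.0114885326, -0.001812683, 0.0092582282, 0.0058740413,
    0.0033302227, 0.0009109483, 0.0153236155, 0.005019864,
]

n = len(residuals)
mean = statistics.fmean(residuals)
sample_var = statistics.variance(residuals)       # divisor n - 1
population_var = statistics.pvariance(residuals)  # divisor n

print(f"n = {n}, mean = {mean:.9f}")
print(f"sample variance     = {sample_var:.8f}")
print(f"population variance = {population_var:.8f}")
```

Both variances come out near 0.00011, below the 0.00014 that PROC ARIMA prints, so neither divisor explains the reported value.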

I've read up a bit more on the variance of ARIMA residuals, and I believe the issue is that a simple VAR of the residuals is not appropriate for ARIMA models when there is autocorrelation in the original time series. So what we have here is apparently a conditional variance. If possible, can someone please explain how to properly estimate the conditional variance of these residuals, assuming that is what PROC ARIMA is doing?

Thanks again,

Greg

Posts: 2,655

## Re: PROC ARIMA Variance Estimate Question

Please don't consider this a smart-aleck answer, but the best way to calculate the variance of the residuals for an ARIMA process is to use PROC ARIMA.

I suppose that you could feed the raw data into PROC MIXED, and with appropriate coding of fixed effects, and much work with the covariance structures, get out an estimate.

I would not do it by hand, as there are too many matrix operations involved to get reasonable answers.

Steve Denham

New Contributor
Posts: 3

## Re: PROC ARIMA Variance Estimate Question

Thanks for the reply, Steve. Regardless of the complexity, I would still like to understand how this variance estimate is computed, so that I can thoroughly explain the methodology behind the forecasts I am producing. Are there any blogs or online papers someone could direct me to that provide an overview of the approach?

Posts: 2,655

## Re: PROC ARIMA Variance Estimate Question

Have you searched through the NIST statistics handbook? It references Brockwell and Davis, 1991, Introduction to Time Series and Forecasting, 2nd ed., as a source for the likelihood function used. I would say to go there, and be prepared for the matrix algebra, if you really want to understand how those parameters are estimated. I'll be honest--it would take me a week to begin to understand it, and I feel pretty comfortable with the matrix algebra involved with generalized linear mixed models.

Short answer: The value you are looking for is the variance component corresponding to the residual error from a nonlinear optimization of the likelihood function for a Box-Jenkins process. The Std Error Estimate in the output is its square root.
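One reading that reproduces the printed figures is offered here as an assumption, not as documented PROC ARIMA behaviour. Under it, the Variance Estimate is the residual sum of squares divided by the residual degrees of freedom, n - k, where k is the number of estimated parameters. The thread never states k, so the sketch below infers it from the gap between AIC and SBC, assuming AIC = -2 ln L + 2k and SBC = -2 ln L + k ln n. It is written in Python purely to check the arithmetic on the residuals from the first post.

```python
import math

# The 40 residuals listed in the first post of this thread.
residuals = [
    0.0131739783, -0.001648352, 0.0055679254, 0.0033041448,
    -0.027348282, -0.018087132, -0.002878327, -0.004774205,
    0.0074996639, -0.012934558, -0.003643018, 0.0033317717,
    0.0143422014, -0.002557585, -0.005269819, 0.0097004778,
    0.0143494424, -0.022979957, 0.0111023891, -0.008277071,
    -0.023881134, 0.0024810503, -0.002777949, -0.007590526,
    0.0033606953, 0.0099064146, -0.001947414, 0.0028343645,
    0.0027540412, -0.003686471, -0.001714685, -0.014344516,
    0.0114885326, -0.001812683, 0.0092582282, 0.0058740413,
    0.0033302227, 0.0009109483, 0.0153236155, 0.005019864,
]

# Values copied from the PROC ARIMA output in the first post.
aic, sbc = -233.52, -218.32

n = len(residuals)

# Assumption: AIC = -2 ln L + 2k and SBC = -2 ln L + k ln(n),
# so SBC - AIC = k * (ln(n) - 2).
k = round((sbc - aic) / (math.log(n) - 2))

# Assumption: Variance Estimate = SSE / (n - k), with the residuals
# NOT centred on their mean before squaring.
sse = sum(r * r for r in residuals)
variance_estimate = sse / (n - k)
std_error_estimate = math.sqrt(variance_estimate)

print(f"inferred parameter count k = {k}")
print(f"SSE / (n - k) = {variance_estimate:.5f}")
print(f"square root   = {std_error_estimate:.6f}")
```

With these inputs k comes out as 9, SSE / 31 rounds to the printed 0.00014, and its square root is about 0.011849, in line with the Std Error Estimate. For this reading to hold, the fitted model would need to have nine estimated parameters, which is worth confirming against the parameter estimates table.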