Frances
Calcite | Level 5
Hello Forum,

I am using AIC to rank regression models from Proc Reg. It looks like SAS is using an incorrect value for the "K" term (number of estimable model parameters) in the AIC formula. According to the literature (e.g., D.R. Anderson & K.P. Burnham, "Avoiding pitfalls when using information-theoretic methods", Journal of Wildlife Management 66(3):912-918), when using AIC with least-squares regression, K is equal to the number of explanatory variables in the model + the intercept + the error term. When I calculate AIC "by hand" and compare it to the SAS value, it looks like SAS is not including the error term as one of the parameters (essentially using K-1).
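To make this concrete, here is a minimal sketch of the hand calculation I mean (the SSE, n, and parameter-count values below are made up purely for illustration):

data aic_by_hand;
   sse = 100;   /* error sum of squares from a fitted model (made-up value) */
   n   = 50;    /* number of observations (made-up value)                   */
   p   = 4;     /* intercept + 3 regressors (made-up value)                 */
   aic_reg_style = n*log(sse/n) + 2*p;       /* what Proc Reg appears to report          */
   aic_burnham   = n*log(sse/n) + 2*(p + 1); /* counts the error variance as a parameter */
   put aic_reg_style= aic_burnham=;          /* log() is the natural log in a data step  */
run;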

Has anyone else noticed this or am I nuts? I'm still using SAS 9.1.3, so maybe this is not an issue in 9.2.

Thanks for any comments,
Frances
1 ACCEPTED SOLUTION

Accepted Solutions
Dale
Pyrite | Level 9
Frances,

What difference does it really make whether the error variance in an OLS model is included as a parameter when computing AIC? Suppose that you have models 1, 2, and 3, each with a different (non-nested) set of fixed-effect parameters fitted to the same set of observations. These models have error sums of squares SSE{1}, SSE{2}, and SSE{3}, and number of regression parameters p{1}, p{2}, and p{3} (where regression parameters include all beta_hat estimates).

Now, if we use the AIC values presented by PROC REG, we have

AIC{1a} = n * ln( SSE{1} /n ) + 2p{1}
AIC{2a} = n * ln( SSE{2} /n ) + 2p{2}
AIC{3a} = n * ln( SSE{3} /n ) + 2p{3}

Differences between AIC values for these models are:

AIC{1a} - AIC{2a} = n * ( ln( SSE{1}/n ) - ln( SSE{2}/n ) ) + 2(p{1} - p{2})
AIC{1a} - AIC{3a} = n * ( ln( SSE{1}/n ) - ln( SSE{3}/n ) ) + 2(p{1} - p{3})
AIC{2a} - AIC{3a} = n * ( ln( SSE{2}/n ) - ln( SSE{3}/n ) ) + 2(p{2} - p{3})

Alternatively, following Anderson and Burnham, you would compute

AIC{1b} = n * ln( SSE{1}/n ) + 2(p{1} + 1)
AIC{2b} = n * ln( SSE{2}/n ) + 2(p{2} + 1)
AIC{3b} = n * ln( SSE{3}/n ) + 2(p{3} + 1)

Differences between AIC values for these models are:

AIC{1b} - AIC{2b} = n * ( ln( SSE{1}/n ) - ln( SSE{2}/n ) ) + 2((p{1} + 1) - (p{2} + 1))
                          = n * ( ln( SSE{1}/n ) - ln( SSE{2}/n ) ) + 2(p{1} - p{2})

AIC{1b} - AIC{3b} = n * ( ln( SSE{1}/n ) - ln( SSE{3}/n ) ) + 2((p{1} + 1) - (p{3} + 1))
                          = n * ( ln( SSE{1}/n ) - ln( SSE{3}/n ) ) + 2(p{1} - p{3})

AIC{2b} - AIC{3b} = n * ( ln( SSE{2}/n ) - ln( SSE{3}/n ) ) + 2((p{2} + 1) - (p{3} + 1))
                          = n * ( ln( SSE{2}/n ) - ln( SSE{3}/n ) ) + 2(p{2} - p{3})


So, when you compare AIC values against one another, you obtain the same difference whether or not you include the variance estimate as one of the parameters. Since the differences between AIC values are unchanged regardless of whether the residual variance is counted as a parameter, it should not matter which form is employed. Can you provide an instance where it would make a difference in model comparisons whether you include the variance estimate among the parameters or not?
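As a concrete illustration with made-up numbers: suppose n = 50, SSE{1} = 100 with p{1} = 3, and SSE{2} = 90 with p{2} = 4. Under the PROC REG convention, AIC{1a} = 50*ln(2) + 6 = 40.66 and AIC{2a} = 50*ln(1.8) + 8 = 37.39, a difference of 3.27. Under the Anderson-Burnham convention, AIC{1b} = 50*ln(2) + 8 = 42.66 and AIC{2b} = 50*ln(1.8) + 10 = 39.39, again a difference of 3.27. The ranking and the AIC differences are identical.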


10 REPLIES
deleted_user
Not applicable
It should be computing correctly. Are you accounting for categorical variables properly, i.e., subtracting 1 from the number of classes, so that a categorical variable with c classes contributes c - 1 parameters to K? That's a common mistake.
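For example (a hypothetical model): two continuous predictors plus one categorical predictor with four classes plus an intercept gives K = 2 + (4 - 1) + 1 = 6 regression parameters, plus one more for the error variance if you follow Burnham and Anderson's convention.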
Frances
Calcite | Level 5
Thanks for your reply, bgdphd.

There are no categorical variables in the models. The models are OLS linear multiple regression models using continuous variables. Consider, for example, the following watershed-scale model:

Dissolved Nitrogen Concentration = intercept + (percent wetland) + (mean slope) + (watershed area) + error term

According to the reference I cited in the original question, K should be equal to the intercept term + the number of explanatory variables + the error term. For the example model above, K = 5; however, SAS appears to be using K = 4.

An example of the code where I request the AIC value looks something like:
Proc reg data=test;
model nitrogen = wetland slope area / selection=adjrsq aic;
run;

Am I interpreting something wrong? Any comments appreciated.
Frances
Calcite | Level 5
Thought I would post an update. After more Google searching, I did find references from others who have run into this problem. Apparently, in multiple linear regression, SAS uses a value for K that is not consistent with the methodology given in Burnham & Anderson. In other cases, such as logistic regression, the computation of AIC is consistent.
It would be great if SAS would remedy this and provide an option to output AICc in Proc Reg as well.
So, be warned.
For a nice summary, see:
Joshua D. Stafford and Bronson K. Strickland. 2003. Potential inconsistencies when computing Akaike's Information Criterion. Bulletin of the Ecological Society of America 84(2):68-69.
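In the meantime, a workable hand fix is to pull SSE and the parameter count from the OUTEST= data set and apply the Burnham & Anderson count and the AICc correction there. A rough sketch only; I am going from memory on the OUTEST= column names, so check them against the documentation for your release:

proc reg data=test outest=est;
   model nitrogen = wetland slope area / selection=adjrsq sse aic;
run;

data aic_fix;
   set est;                                    /* one row per candidate model under selection= */
   n = _P_ + _EDF_;                            /* observations used = parameters + error df    */
   k = _P_ + 1;                                /* intercept + slopes + error variance          */
   aic_ba  = n*log(_SSE_/n) + 2*k;             /* AIC with K counted per Burnham & Anderson    */
   aicc_ba = aic_ba + 2*k*(k + 1)/(n - k - 1); /* small-sample (second-order) correction       */
run;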
Dale
Pyrite | Level 9
Frances,

I think you might prefer to use the MIXED procedure to fit models and obtain IC statistics. When using the MIXED procedure and estimation via maximum likelihood, AIC = -2LL + 2*(q + p) where q is the number of parameters in the covariance matrix and p is the number of parameters that are estimated as part of the model fixed effects.

Note, though, that if you use the MIXED procedure with REML estimation, the AIC formula is AIC = -2LL + 2q. When estimating models by REML, it is not appropriate to use likelihood-based statistics to compare models that differ in their fixed effects, so counting only the covariance parameters in the REML AIC is appropriate.
deleted_user
Not applicable
Ditto the above. I recommend using METHOD=ML in place of the default REML for parameter estimation when you want to proceed with multi-model inference or comparison of candidate models using AIC. In your example, the only covariance parameter is the residual variance (q = 1), since you are not modeling any additional covariance structure, and MIXED should compute AIC and the other information criteria correctly.
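For reference, a minimal sketch of Frances's model fitted in MIXED (same hypothetical data set and variable names as her PROC REG example):

proc mixed data=test method=ml;
   model nitrogen = wetland slope area / solution;
run;

/* The Fit Statistics table reports AIC, AICC, and BIC; under METHOD=ML the
   parameter count includes the fixed effects plus the residual variance.  */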
Frances
Calcite | Level 5
Thanks for the advice, bgdphd & Dale,

Now that I'm aware of the AIC calculation discrepancy, I can fix the value in my code. I am not sure why I would use the MIXED procedure, however. I thought one typically chose LMM and GLMM methods when dealing with both fixed and random effects. We are only dealing with fixed effects in our analysis. Any comments on this?
deleted_user
Not applicable
You can still proceed with MIXED.
deleted_user
Not applicable
The formula for AIC is clearly described in the PROC REG section of the SAS online documentation: AIC = n*ln(SSE/n) + 2p, where p is the number of parameters including the intercept. p does not include the error variance.
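You can verify this against your own output; a small sketch, reusing the hypothetical OUTEST= data set (est) from the earlier PROC REG run and assuming the _SSE_/_AIC_/_P_/_EDF_ columns are present:

data check_doc_formula;
   set est;
   n = _P_ + _EDF_;                                       /* observations used in the fit   */
   aic_doc = n*log(_SSE_/n) + 2*_P_;                      /* formula from the documentation */
   match   = (round(aic_doc, 1e-6) = round(_AIC_, 1e-6)); /* 1 if it reproduces _AIC_       */
run;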
deleted_user
Not applicable
From my limited knowledge of AIC, I do not think SAS has made a careless mistake in its calculation; the formula it uses is valid for least-squares regression. The following link describes the details of AIC:
http://en.wikipedia.org/wiki/Akaike_information_criterion
