Hi all, is there a simple explanation for why the AICC reported by GLIMMIX for a simple regression changes with the scale of the regressor? For example:
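data test;
    call streaminit(756623);        /* fixed seed so the example is reproducible */
    do x = 1 to 20;
        x1000 = x * 1000;           /* same regressor, scaled up by 1000 */
        x0001 = x * 0.001;          /* same regressor, scaled down by 1000 */
        y = 2 * x + rand("NORMAL"); /* true slope 2 plus standard normal noise */
        output;
    end;
run;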
proc glimmix data=test;
model y = x0001;
ods output FitStatistics=Fit_0001;
run;
proc glimmix data=test;
model y = x;
ods output FitStatistics=Fit_1;
run;
proc glimmix data=test;
model y = x1000;
ods output FitStatistics=Fit_1000;
run;
data FSall;
set Fit_0001 Fit_1 Fit_1000 indsname=source;
where descr =: "AICC";
from = source;
run;
proc print data=FSall noobs; run;
Descr                      Value   from
AICC (smaller is better)   58.17   WORK.FIT_0001
AICC (smaller is better)   71.99   WORK.FIT_1
AICC (smaller is better)   85.80   WORK.FIT_1000
PG
This is because the default for GLIMMIX (and MIXED) is REML, restricted or residual maximum likelihood (METHOD=RSPL). With REML, the fixed effects are removed before the likelihood is determined, so the restricted likelihood depends on how the fixed-effect design matrix is scaled. By rescaling x, you are really fitting three different fixed-effects models (as shown by the different scale of the slopes). You can't compare -2 log likelihood, AIC, or AICC across different fixed-effects models under REML estimation. If you add METHOD=MSPL (to get ML), then AICC is 66.25 in all three situations. The -2 log likelihood and the other information criteria also agree. With ML, one can compare models that differ in their fixed effects or in their random effects.
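To make the scale dependence concrete, here is a sketch of the standard restricted likelihood for a linear model with independent errors, y ~ N(Xβ, σ²I), with n observations and p fixed-effect columns (here n = 20 and p = 2). The notation (X, D, c, r) is introduced only for this illustration and does not come from GLIMMIX output. It also assumes GLIMMIX keeps the log-determinant term in what it reports, which the AICC values in the question are consistent with.

```latex
\begin{align*}
-2\,\ell_{R}(\sigma^2) &= (n-p)\log(2\pi\sigma^2) + \log\lvert X^{\top}X\rvert
   + \frac{r^{\top}r}{\sigma^2}, \qquad r = y - X\hat\beta \\
X_c &= X D,\quad D = \operatorname{diag}(1, c)
   \;\Rightarrow\; \lvert X_c^{\top}X_c\rvert = c^{2}\,\lvert X^{\top}X\rvert \\
-2\,\ell_{R}^{(c)} &= -2\,\ell_{R} + 2\log c \\
-2\,\ell_{ML}(\sigma^2) &= n\log(2\pi\sigma^2) + \frac{r^{\top}r}{\sigma^2}
   \quad\text{(no } X^{\top}X \text{ term, so unchanged by rescaling)}
\end{align*}
```

Rescaling x leaves the residuals r, and therefore the estimate of σ², unchanged, so only the determinant term moves. With c = 1000, 2 log c ≈ 13.82, which matches the steps from 58.17 to 71.99 to 85.80 in the reported AICC values.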
Thanks a lot, lvm. I will switch to MSPL. What are the drawbacks? There must be a good reason why REML is the default method.
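For reference, a minimal sketch of the refit, assuming the test data set from my first post. The FitML_* and FSall_ML data set names are just placeholders, and METHOD=MSPL is the option suggested in the reply above. I have not reproduced the output here; per that reply, all three rows should show the same AICC (66.25).

```sas
/* Refit each scaling with ML (METHOD=MSPL) instead of the default REML */
proc glimmix data=test method=mspl;
    model y = x0001;
    ods output FitStatistics=FitML_0001;
run;
proc glimmix data=test method=mspl;
    model y = x;
    ods output FitStatistics=FitML_1;
run;
proc glimmix data=test method=mspl;
    model y = x1000;
    ods output FitStatistics=FitML_1000;
run;

/* Stack the AICC rows and record which fit each one came from */
data FSall_ML;
    set FitML_0001 FitML_1 FitML_1000 indsname=source;
    where descr =: "AICC";
    from = source;
run;
proc print data=FSall_ML noobs; run;
```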
PG
ML variance estimates are biased, especially with small sample sizes. In the simplest possible case, the ML estimate of a one-sample variance divides by n instead of n-1. The bias becomes quite small at large n. In unbalanced designs and other more complex situations, the bias can carry over to the fixed-effect parameters (expected values, slopes, etc.). REML is unbiased, or at least less biased, so it is the better choice for estimation and hence the default. (There are other reasons, which I won't get into.)
In GLIMMIX and MIXED, the primary purpose of AIC and related criteria is to compare models with different random effects, because there is often no a priori best choice for the random effects. One cannot directly compare an AIC from REML with an AIC from ML. If you do want to use information criteria to compare models with different fixed effects, then you must use ML estimation.
Others might not agree, but the bias of ML generally will not be large if the degrees of freedom are large.
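As a worked illustration of the size of that bias, here is the standard result for a linear model with independent normal errors, n observations, and p fixed-effect parameters. The symbols (SSE for the residual sum of squares, n, p) are chosen for this sketch only and are not GLIMMIX output names.

```latex
\begin{align*}
\hat\sigma^2_{ML} &= \frac{SSE}{n}, &
\hat\sigma^2_{REML} &= \frac{SSE}{n-p}, \\
E\!\left[\hat\sigma^2_{ML}\right] &= \frac{n-p}{n}\,\sigma^2, &
E\!\left[\hat\sigma^2_{REML}\right] &= \sigma^2 .
\end{align*}
```

For the one-sample case p = 1, which gives the familiar n versus n-1 divisor. For the simple regression in this thread (n = 20, p = 2), the ML variance estimate is on average 18/20 = 0.9 of the true value. The factor (n-p)/n approaches 1 as n grows with p fixed.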
Thank you again! That's most helpful. So, a decent strategy would be to choose a model using ML and then refit it with REML to get better estimates? - PG
Henderson showed that the REML estimates were equivalent to Bayesian estimates, so REML is the bastard link between frequentists and Bayesians.
At least that's what I vaguely remember from a grad course over 30 years ago...
Anyway, choosing a model is always fraught with difficulties/drawbacks, but you are probably better off using ML (or in GLIMMIX, quasi-likelihood) to select amongst known competing models. My opinion is worth the $0.02 of electrons killed to present it.
Steve Denham