Josephv

I am creating a mixed model using proc mixed, and have noticed that the Fit Statistics (AIC and log likelihood) results change depending on the reference values set in the class statement. I am using version 9.4.

 

For example, in the Ozone dataset (attached), each subject receives one of two treatments (exposure to regular air or to air with ozone added) and has an outcome (FEV1) measured at several times (0, 1, 2, 3, 4, 5, and 6). In the table, subjects are identified by the "ID" column, exposure is in the "ozone_ind" column (air is 0 and ozone is 1), and time and FEV1 are in the "time" and "FEV1" columns respectively. All variables are stored as numeric (as opposed to character).

 

When I run the following code, I get a log-likelihood of 1837.4 and AIC of 1841.4

 

data ozone; set ozone; t=time; run;
proc mixed data=ozone method=reml;
     class id t ozone_ind(ref='0');
     model fev1 = ozone_ind time ozone_ind*time / s ddfm=KENWARDROGER;
     repeated t / type=ar(1) sub=id ; run;

 

 

However, with a reference option for t, I get a log-likelihood of 1909.6 and AIC of 1913.

 

data ozone; set ozone; t=time; run;
proc mixed data=ozone method=reml;
     class id t(ref='0') ozone_ind(ref='0');
     model fev1 = ozone_ind time ozone_ind*time / s ddfm=KENWARDROGER;
     repeated t / type=ar(1) sub=id ; run;

 

 
I've noticed that I can reproduce the result obtained with no reference value by using t(ref='6'). By default, I think SAS uses the highest value as the reference, so it makes sense that specifying t(ref='6') produces the same results as specifying nothing. Can anyone explain why setting the ref value changes the AIC so much? Intuitively, setting the reference to zero makes sense to me, but it seems to produce a worse fit. How do I decide on the best reference value?
 
I saw a similar question was posed here, but in regard to proc phreg. Part of the answer there was to use PARAM=REF, but I've never done that with proc mixed, so I'm not sure if that is appropriate here or even possible. 
 
Best, 
Joseph V.
1 REPLY
Rick_SAS

Yes, the parameterization affects the parameter estimates, and interpretation gets messy when you have an interaction term, as you do.

PROC MIXED uses a GLM parameterization. A complete explanation is in the SAS/STAT documentation in the section "GLM Parameterization of Classification Variables and Effects."
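
One way to see what the GLM parameterization does to your fixed effects is to print the design matrix. The following is only a sketch, assuming the OZONE data set described in the question. PROC GLMMOD builds a GLM-coded design matrix, and the data set names parm_map and design_mat are placeholders chosen for illustration. The CLASS statement here uses the default level ordering, so the last level of ozone_ind plays the role of the reference.

```sas
/* Sketch: inspect the GLM-coded design matrix for the fixed effects.   */
/* Assumes the OZONE data set with FEV1, OZONE_IND, and TIME.           */
/* PARM_MAP and DESIGN_MAT are illustrative output data set names.      */
proc glmmod data=ozone outparm=parm_map outdesign=design_mat;
   class ozone_ind;
   model fev1 = ozone_ind time ozone_ind*time;
run;

/* PARM_MAP shows which effect and level each design column represents */
proc print data=parm_map;
run;

/* DESIGN_MAT holds the design columns next to the response variable   */
proc print data=design_mat(obs=10);
run;
```

The design matrix has one column per level of ozone_ind, which is why the estimate for the last level is set to zero in the solution table.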

 

As you say, the default reference value is REF=LAST, which corresponds to '6' in your example. PROC MIXED does not support other parameterizations, so, for example, the procedure does not support the PARAM=REF option.

 

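To use the log-likelihood to compare model fits, use the same parameterization (the same CLASS statement, including reference levels) for all models. Note also that in PROC MIXED the REF= option moves the named level to the end of the list of levels. Because t is your REPEATED effect, t(ref='0') likely changes the order in which the AR(1) structure sees the time points, which would change the fitted covariance model and therefore the likelihood.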
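
As a rough illustration of comparing fits under one parameterization, the sketch below keeps the CLASS statement identical in both runs and changes only the covariance structure. It assumes the OZONE data set with the t variable created as in the question. TYPE=CS is just an example alternative to AR(1), and fit_ar1, fit_cs, and fit_compare are placeholder data set names.

```sas
/* Sketch: two REML fits with the SAME CLASS statement and fixed effects; */
/* only the covariance structure in the REPEATED statement differs.       */
ods output FitStatistics=fit_ar1;
proc mixed data=ozone method=reml;
   class id t ozone_ind(ref='0');
   model fev1 = ozone_ind time ozone_ind*time / s ddfm=kenwardroger;
   repeated t / type=ar(1) sub=id;
run;

ods output FitStatistics=fit_cs;
proc mixed data=ozone method=reml;
   class id t ozone_ind(ref='0');
   model fev1 = ozone_ind time ozone_ind*time / s ddfm=kenwardroger;
   repeated t / type=cs sub=id;
run;

/* Put the two Fit Statistics tables side by side. The rows come out in  */
/* the same order from both runs, so a one-to-one merge lines them up.   */
data fit_compare;
   merge fit_ar1(rename=(Value=ar1))
         fit_cs(drop=Descr rename=(Value=cs));
run;

proc print data=fit_compare noobs;
run;
```

Because both runs use the same data, the same fixed effects, and the same level ordering for t, the information criteria in fit_compare are on a comparable scale.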

 

 
