LilyY_
Calcite | Level 5

Dear all,

Do we have to put binary variables in the CLASS statement in modeling procedures like PROC MIXED? I remember reading somewhere that for a binary variable it doesn't matter whether it goes in the CLASS statement or not. But then why did I get very different results when the binary variable was in the CLASS statement?

Should we always put all binary variables in the CLASS statement?

Thanks,

SteveDenham
Jade | Level 19

I would (almost) always include them in the CLASS statement.  To consider them as continuous variables (i.e., exclude from the CLASS statement) could lead to unusual interpretations, especially if you start to create ESTIMATE statements.  Also, it will enable you to get LSMEANS for each level of the variable.
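
As a rough illustration of the LSMEANS point, here is a minimal sketch. The dataset name xxx, response yy, subject ID, and the compound-symmetry REPEATED structure are placeholders assumed for illustration only.

```sas
/* Sketch only: xxx, yy, ID, Time, and Role are placeholder names. */
proc mixed data=xxx;
  class ID Time Role;                         /* Role declared as a classification variable */
  model yy = Time Role Time*Role / solution;  /* SOLUTION prints the fixed-effect estimates */
  repeated Time / type=cs subject=ID;         /* assumed covariance structure */
  lsmeans Role / diff cl;                     /* per-level means and their difference */
run;
```

If Role is taken out of the CLASS statement, the LSMEANS Role statement is no longer available, because LS-means are computed only for effects made up of CLASS variables.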

Steve Denham

LilyY_
Calcite | Level 5

Thanks, Steve! I thought this wouldn't change the results, but I was surprised to find that they changed significantly: not for the binary variable itself, but for the other variable that has an interaction term with it. Why?

I have time, role, and time*role in the model.

If Role is not in the CLASS statement, I get these p-values from the Type 3 tests: Time: p=.68, Role: p=.25, Time*Role: p=.79

If Role is in the CLASS statement: Time: p=.04, Role: p=.25, Time*Role: p=.79

Role and time*role didn't change, but time changed a lot. Why?


SteveDenham
Jade | Level 19

Hi Lily,

I think it will become apparent what is happening if you add the solution option to the model statement, but I could be wrong.  First, is time a categorical or a continuous covariate?  If continuous, then the latter values test the following: time - does the overall slope differ from zero, averaged over roles, role - do intercepts differ, time*role - do slopes differ for the two roles.  The former values all are tests of whether slopes differ from zero, and of deviations from a homogeneous slope (interaction), so you are really testing different hypotheses, resulting in different p values for the tests.
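
To see the two sets of hypotheses side by side, one could fit the model both ways with the SOLUTION option and save the estimates and Type 3 tests. This is only a sketch under assumptions: xxx, yy, and ID are placeholder names, Time is taken to be continuous, Role is taken to be numeric, and the compound-symmetry REPEATED statement is an assumed structure.

```sas
/* Sketch only: dataset and variable names are placeholders. */

/* Role NOT in CLASS: Role enters as a continuous covariate. */
proc mixed data=xxx;
  class ID;
  model yy = Time Role Time*Role / solution;
  repeated / type=cs subject=ID;              /* assumed covariance structure */
  ods output SolutionF=sol_noclass Tests3=t3_noclass;
run;

/* Role in CLASS: Role enters as a two-level factor. */
proc mixed data=xxx;
  class ID Role;
  model yy = Time Role Time*Role / solution;
  repeated / type=cs subject=ID;
  ods output SolutionF=sol_class Tests3=t3_class;
run;
```

With Role out of CLASS, the Time coefficient in sol_noclass is the slope of Time at Role = 0, so its test depends on how Role is coded. With Role in CLASS, the Type 3 test for Time refers to the slope averaged over the two roles, as described above.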

If time is a categorical variable, then I'm very surprised by the results.

Steve Denham

LilyY_
Calcite | Level 5

Hi Steve,

Time was continuous in the above example. Your explanation makes sense. But the attached output is from models with Time as a categorical variable, with and without Role in the CLASS statement. Any clue? Still only the coefficient for Time changed; Role and Time*Role remained the same.

Role In CLASS Statement:

CLASS.jpg

Role Not in CLASS Statement:

NoCLASS.jpg

Thanks,


SteveDenham
Jade | Level 19

Hi Lily,

That's interesting.  The first thing I notice is that the standard errors for the time estimates are much smaller in the first picture.  This makes me think there are some interesting things going on elsewhere in the model.  Would it be possible to show all of your MIXED code?  In particular, I wonder about things like RANDOM and REPEATED elements. Something has to be absorbing the variability.  Correlations may also be involved.

Steve Denham

LilyY_
Calcite | Level 5

I have

proc mixed data=xxx;
  class ID Time;   /* or: class ID Time Role; */
  model yy = Time Role Time*Role / s;
  repeated Time / type=cs sub=ID;
run;

SteveDenham
Jade | Level 19

Try adding rcorr to the repeated statement, and then let's look at the correlation matrix under the two class systems.  I think things get "tighter" resulting in smaller standard errors.  The other part of the output to look at will be the covariance parameter estimates.

Steve Denham

Doc_Duke
Rhodochrosite | Level 12

Lily,

Contrary to what Steve said, I often include binary variables as continuous in analyses.  It can create some very nice shortcuts for summarizing a lot of data.  In previous versions of SAS, we had to do that in order to include the categorical predictor when there was no CLASS statement available (e.g. PROC LOGISTIC in version 6).  The binary variables had to be coded 0-1 for this to work.

I think that the reason you are getting the different estimates when you treat Role as continuous is that it is coded 1 and 2.  If you recode it as a 0-1 variable, you should get the same result for the model estimation with it as a class or continuous variable. 
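
The recoding suggested here might look like the following minimal sketch. It assumes Role is numeric and coded 1 and 2; the names xxx, xxx_recode, Role01, and RoleC are placeholders, and the centered variable RoleC is an extra coding added for comparison, not something proposed above.

```sas
/* Sketch only: assumes Role is numeric with values 1 and 2. */
data xxx_recode;
  set xxx;
  Role01 = Role - 1;      /* 1 -> 0,    2 -> 1   */
  RoleC  = Role - 1.5;    /* 1 -> -0.5, 2 -> 0.5 (centered; illustrative addition) */
run;

proc mixed data=xxx_recode;
  class ID Time;                               /* Role01 left out of CLASS: continuous */
  model yy = Time Role01 Time*Role01 / solution;
  repeated Time / type=cs subject=ID;
run;
```

One caveat when the model contains a Time*Role interaction: with Role treated as continuous, the Time test refers to the Time differences at Role = 0. Under 0-1 coding that is the group coded 0; under 1-2 coding it is a value outside the data. Substituting RoleC for Role01 moves that zero point midway between the two groups, which should correspond to the Time test averaged over roles that the CLASS version reports.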

As Steve said, you will not be able to use some of the optional tools if it is treated as continuous.

Doc Muhlbaier

Duke

LilyY_
Calcite | Level 5

Hi Dr. Muhlbaier,

I tried recoding Role as 0/1. It didn't give the same output as putting it in the CLASS statement.

The p-values are: Time: 0.0024, Role1: 0.0183, Time*Role: 0.6874

Still, the p-values for Role and Time*Role remained the same, but Time is different.

Correlation matrix and covariance parameter estimates with Role recoded as 0/1 and NOT in CLASS:

Cov and Rcorr No Class.jpg

Correlation matrix and covariance parameter estimates with Role in CLASS:

Cov and Rcorr CLASS.jpg

Role coded as 1/2 and NOT in CLASS:

Cov and Rcorr No Class as 12.jpg

SteveDenham
Jade | Level 19

Well, it seems pretty obvious that the correlation matrix is identical under all of the parameterizations.  Time to look at the likelihood function.  How much does it change when you change ROLE between continuous and categorical?  If there is a big difference, then I get nervous that there is missing data or something that is causing cases to be excluded under one or the other situation.
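
A minimal sketch of that likelihood comparison, assuming the placeholder names xxx, yy, and ID and a numeric Role; the output dataset names fit_class and fit_noclass are made up for illustration.

```sas
/* Sketch only: dataset and variable names are placeholders. */

/* Fit with Role in CLASS and save the fit statistics table. */
ods output FitStatistics=fit_class;
proc mixed data=xxx;
  class ID Time Role;
  model yy = Time Role Time*Role / solution;
  repeated Time / type=cs subject=ID;
run;

/* Same model with Role treated as continuous. */
ods output FitStatistics=fit_noclass;
proc mixed data=xxx;
  class ID Time;
  model yy = Time Role Time*Role / solution;
  repeated Time / type=cs subject=ID;
run;

/* Line up -2 log likelihood, AIC, AICC, and BIC from the two fits. */
proc compare base=fit_class compare=fit_noclass;
run;
```

It is also worth checking the "Number of Observations" table in each listing to confirm that both fits used the same number of records.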

Steve Denham

LilyY_
Calcite | Level 5

There are no differences. With Role as continuous or categorical, all the fit statistics are the same (-2 LL, AIC, AICC, BIC).

SteveDenham
Jade | Level 19

Something is causing the standard errors of the estimates to change radically, and if it is not the residual error (likelihood) or the correlations (RCORR) then I am out of the things that I know.  At this point, I think opening a track with SAS Technical Support is well worth your time.  When you get something from them, could you please post it here?  I am really curious, and I don't have a clue at this point as to what is causing this behavior.

Steve Denham

Reeza
Super User

You could also consider posting your code; you've only posted results so far.

But I would contact Tech Support as well.

LilyY_
Calcite | Level 5

I posted the code earlier (in #6). Here it is again:

proc mixed data=xxx covtest;
  class ID Time;   /* or: class ID Time Role; */
  model yy = Time Role Time*Role / s;
  repeated Time / type=cs sub=ID rcorr;
run;
