04-26-2013 08:37 AM
Do we have to put a binary variable in the CLASS statement in modeling procedures like PROC MIXED? I remember reading somewhere that if a variable is binary, it doesn't matter whether it goes in the CLASS statement or not. But then why did I get very different results when the binary variable was in the CLASS statement?
Should we always put all binary variables in the CLASS statement?
04-26-2013 08:55 AM
I would (almost) always include them in the CLASS statement. Treating them as continuous variables (i.e., leaving them out of the CLASS statement) could lead to unusual interpretations, especially if you start to write ESTIMATE statements. Including them in CLASS also enables you to get LSMEANS for each level of the variable.
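For illustration, a minimal sketch of what that looks like. The data set and variable names (xxx, yy, id, time, role) are borrowed from the code posted later in this thread; the DIFF and CL options are added here only as an example and are not something the original poster used.

```sas
/* Sketch only: role is listed in CLASS, so LSMEANS can report one
   adjusted mean per level of role. */
proc mixed data=xxx;
  class id time role;
  model yy = time role time*role / solution;
  repeated time / type=cs subject=id;
  lsmeans role / diff cl;   /* level means, their difference, confidence limits */
run;
```

If role is removed from the CLASS statement, the LSMEANS statement for role is no longer allowed, since LSMEANS effects must be built from classification variables.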
04-26-2013 10:00 AM
Thanks, Steve! I thought this wouldn't change the results, but surprisingly the results changed significantly: not for the binary variable itself, but for the other variable that has an interaction term with it. Why?
The model has time, role, and time*role.
If Role is not in the CLASS statement, the Type 3 tests give: Time: p=.68, Role: p=.25, Time*Role: p=.79
If Role is in the CLASS statement: Time: p=.04, Role: p=.25, Time*Role: p=.79
Role and Time*Role didn't change, but Time changed a lot. Why?
04-26-2013 11:23 AM
I think it will become apparent what is happening if you add the solution option to the model statement, but I could be wrong. First, is time a categorical or a continuous covariate? If continuous, then the latter values test the following: time - does the overall slope differ from zero, averaged over roles, role - do intercepts differ, time*role - do slopes differ for the two roles. The former values all are tests of whether slopes differ from zero, and of deviations from a homogeneous slope (interaction), so you are really testing different hypotheses, resulting in different p values for the tests.
If time is a categorical variable, then I'm very surprised by the results.
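A sketch of the comparison suggested in this post, for the case where time is a continuous covariate. One assumption is labeled loudly here: because the REPEATED effect has to be a classification variable, the sketch introduces a hypothetical copy of time, called timec, that is used only to index the repeated measures. Neither timec nor the data set xxx2 appears in the original posts.

```sas
/* Hypothetical helper: timec is a copy of time used only in CLASS/REPEATED,
   so that time itself can enter the model as a continuous covariate. */
data xxx2;
  set xxx;
  timec = time;
run;

/* Fit 1: role as a classification variable */
proc mixed data=xxx2;
  class id timec role;
  model yy = time role time*role / solution;
  repeated timec / type=cs subject=id;
run;

/* Fit 2: role as a continuous (numeric) covariate */
proc mixed data=xxx2;
  class id timec;
  model yy = time role time*role / solution;
  repeated timec / type=cs subject=id;
run;
```

Comparing the two "Solution for Fixed Effects" tables side by side shows which coefficient each Type 3 test for time refers to under each coding.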
04-26-2013 12:04 PM
Time was continuous in the example above. Your explanation makes sense. But the attached output is from models with time as a categorical variable, with and without Role in the CLASS statement. Any clue? Still only the coefficient for Time changed. Role and Time*Role remained the same.
Role In CLASS Statement:
Role Not in CLASS Statement:
04-26-2013 12:20 PM
That's interesting. The first thing I notice is that the standard errors for the time estimates are much smaller in the first picture. This makes me think there are some interesting things going on elsewhere in the model. Would it be possible to show all of your MIXED code? In particular, I wonder about things like RANDOM and REPEATED elements. Something has to be absorbing the variability. Correlations may also be involved.
04-26-2013 01:16 PM
Try adding rcorr to the repeated statement, and then let's look at the correlation matrix under the two class systems. I think things get "tighter" resulting in smaller standard errors. The other part of the output to look at will be the covariance parameter estimates.
04-26-2013 02:09 PM
Contrary to what Steve said, I often include binary variables as continuous in analyses. It can create some very nice shortcuts for summarizing a lot of data. In previous versions of SAS, we had to do that in order to include the categorical predictor when there was no CLASS statement available (e.g. PROC LOGISTIC in version 6). The binary variables had to be coded 0-1 for this to work.
I think that the reason you are getting the different estimates when you treat Role as continuous is that it is coded 1 and 2. If you recode it as a 0-1 variable, you should get the same result for the model estimation with it as a class or continuous variable.
Like Steve said, you will not be able to use some of the optional tools if it is treated as continuous.
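A minimal sketch of the recoding described in this post, assuming role takes only the values 1 and 2. The output data set xxx01 and the new variable role01 are hypothetical names, and which level is mapped to 0 is an arbitrary choice made for the example.

```sas
/* Recode role from 1/2 to 0/1 in a new variable, leaving the original intact. */
data xxx01;
  set xxx;
  if role = 1 then role01 = 0;
  else if role = 2 then role01 = 1;
  /* any other value, including missing, leaves role01 missing */
run;

/* Quick check that the recode did what was intended */
proc freq data=xxx01;
  tables role*role01 / missing;
run;
```

The model can then be refit with role01 in place of role, without listing role01 in the CLASS statement.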
04-26-2013 04:16 PM
Hi Dr. Muhlbaier,
I tried recoding Role as 0/1. It didn't give the same output as putting it in the CLASS statement.
The p-values are: Time: 0.0024, Role1: 0.0183, Time*Role: 0.6874
Still, the p-values for Role and Time*Role remained the same, but the one for Time is different.
Correlation matrix and covariance parameter estimates with Role recoded as 0/1 and NOT in CLASS:
Correlation matrix and covariance parameter estimates with Role in CLASS:
Role as 1/2 and NOT in CLASS:
04-29-2013 09:34 AM
Well, it seems pretty obvious that the correlation matrix is identical under all of the parameterizations. Time to look at the likelihood function. How much does it change when you change ROLE between continuous and categorical? If there is a big difference, then I get nervous that there is missing data or something that is causing cases to be excluded under one or the other situation.
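One way to make that check concrete is to capture the fit statistics and the observation counts from both fits with ODS OUTPUT. This is a sketch only: the output data set names (fit_class, nobs_class, fit_cont, nobs_cont) are made up for the example, and the model is the one posted in this thread.

```sas
/* Fit with role in CLASS; save -2 log likelihood etc. and observation counts */
ods output FitStatistics=fit_class NObs=nobs_class;
proc mixed data=xxx;
  class id time role;
  model yy = time role time*role / s;
  repeated time / type=cs sub=id;
run;

/* Same model with role treated as continuous */
ods output FitStatistics=fit_cont NObs=nobs_cont;
proc mixed data=xxx;
  class id time;
  model yy = time role time*role / s;
  repeated time / type=cs sub=id;
run;

proc print data=fit_class;  title "Fit statistics, role in CLASS";     run;
proc print data=fit_cont;   title "Fit statistics, role continuous";   run;
proc print data=nobs_class; title "Observations, role in CLASS";       run;
proc print data=nobs_cont;  title "Observations, role continuous";     run;
title;
```

If the "observations used" counts differ between the two fits, cases are being dropped under one coding. Note that restricted likelihoods are not, in general, directly comparable across different fixed-effects codings, so adding METHOD=ML to both PROC MIXED statements may give a cleaner comparison.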
04-29-2013 11:00 AM
Something is causing the standard errors of the estimates to change radically, and if it is not the residual error (likelihood) or the correlations (RCORR) then I am out of the things that I know. At this point, I think opening a track with SAS Technical Support is well worth your time. When you get something from them, could you please post it here? I am really curious, and I don't have a clue at this point as to what is causing this behavior.
04-29-2013 12:35 PM
I posted the code earlier (in post #6). Here it is again:
proc mixed data=xxx covtest;
  class ID Time;   /* or: class ID Time Role; */
  model yy = time role time*role / s;
  repeated time / type=cs sub=id rcorr;
run;