Re: Mixed model with random and repeated statement

fdehkord · Posted 02-16-2022 02:54 PM

I fitted a model using PROC MIXED, once with a RANDOM statement and once with a REPEATED statement, and then fitted the same model using PROC GENMOD. I expected to see similar results. Why is the variable outcome significant when I use GENMOD, not significant when I use PROC MIXED with the RANDOM statement, and borderline significant when I use PROC MIXED with the REPEATED statement? Y is measured three times on each subject, outcome is a binary variable (Yes, No), and Time (1, 2, 3) refers to the order of the measurements.

proc mixed data=mine;
   class outcome id time;
   model y = outcome time / ddfm=kr2 solution;
   random int / type=un subject=id;
run;

proc mixed data=mine;
   class outcome id time;
   model y = outcome time / ddfm=kr2 solution;
   repeated / type=un subject=id;
run;

PROC GENMOD DATA=mine;
   CLASS id time outcome;
   MODEL y = outcome time;
   REPEATED SUBJECT=id / TYPE=UN CORRW;
RUN;

StatDave · Posted 02-16-2022 05:35 PM

PROC GENMOD with the REPEATED statement fits a Generalized Estimating Equations (GEE) model using a GEE algorithm. PROC MIXED with the REPEATED statement fits the model by likelihood methods (restricted maximum likelihood, REML, by default). The different fitting methods can be expected to give different results. MIXED with the RANDOM statement fits a subject-specific model (for making inferences or predictions on individual subjects), while GENMOD fits a population-averaged model (for making overall, population-level inferences or predictions), and these are also expected to differ.
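
One concrete way to see part of the gap between the two MIXED fits, offered as a sketch rather than a diagnosis of this data set: a random intercept implies a compound-symmetric covariance among the three measurements on a subject (two covariance parameters), whereas REPEATED with TYPE=UN estimates all six variances and covariances freely. Assuming the data set and variable names from the original post, a REPEATED statement with TYPE=CS should reproduce the random-intercept fit whenever the estimated between-subject variance is positive:

```sas
/* Sketch only: data set and variable names are taken from the original post. */
/* TYPE=CS is swapped in here for illustration; it is not in the posted code. */
proc mixed data=mine;
   class outcome id time;
   model y = outcome time / ddfm=kr2 solution;
   /* Compound symmetry: one common variance and one common covariance, */
   /* the same marginal structure that RANDOM INT / SUBJECT=id implies. */
   repeated / type=cs subject=id;
run;
```

If this fit matches the RANDOM version but not the TYPE=UN version, the difference between the two MIXED runs comes from the covariance structure rather than from the choice of statement.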

SteveDenham · Posted 02-17-2022 08:39 AM

In addition to what @StatDave said, note that your response variable is binary, and none of your PROC codes take that into account. The REML/ML methods in MIXED are conditional on the random effects, while the GEE methods in GENMOD are marginal estimates, so I am not surprised that the results are different, and both could be biased (but by different amounts) due to the assumption of normal errors, depending on the sample size.

SteveDenham

fdehkord · Posted 02-17-2022 01:39 PM

Thank you for your explanation. Now the question is which method should be used if the goal is to evaluate the relationship between HBCO and the outcome? Thanks.

StatDave · Posted 02-17-2022 03:31 PM

That sounds like a population inference rather than predicting at the individual level. If so, then the GEE model is probably what you want. But note that PROC GEE is the newer procedure for fitting this model and is recommended over GENMOD for that purpose. It uses essentially identical syntax. Also, it is not clear whether the response variable (Y in your initial post) is continuous or binary. If it is binary, then you should specify DIST=BIN in the MODEL statement. In any case, you should use the DIST= option to specify the appropriate distribution for your response variable.
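
As a rough sketch of what that could look like, here is the GENMOD call from the initial post rewritten for PROC GEE with an explicit DIST= option. The data set and variable names come from the original post. DIST=NORMAL is an assumption that Y is continuous, and WITHIN=time is added so that the unstructured working correlation lines up by measurement occasion.

```sas
/* Sketch only: assumes y is continuous. If y were binary, */
/* DIST=BIN would replace DIST=NORMAL in the MODEL statement. */
proc gee data=mine;
   class id time outcome;
   model y = outcome time / dist=normal;
   /* WITHIN=time orders the repeated measures within each subject; */
   /* CORRW prints the estimated working correlation matrix.        */
   repeated subject=id / within=time type=un corrw;
run;
```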
