12-15-2015 08:31 AM
I have data which looks like this:
The data come from a trial testing a new treatment for an eye condition (e.g., cataract).
Every subject contributes 2 observations, one from each eye, so the data are correlated. I have 20 subjects in each group, and thus 40 observations in each group.
I am trying to model the data using various methods, and to learn from it about the differences between GLMM and GEE in general and in SAS in particular.
I tried the following:
/* Model 1: GLMM with a random intercept per subject (conditional model) */
proc glimmix data=Example1;
  class Group ID;
  model outcome(event='1') = Group / dist=binary solution;
  random int / sub=ID;
  lsmeans Group / ilink;
run;

/* Model 2: R-side (residual) unstructured correlation between the two eyes */
proc glimmix data=Example1;
  class ID Group Eye;
  model outcome(event='1') = Group / dist=binary solution;
  random Eye / residual type=unr subject=ID;
  lsmeans Group / ilink;
run;

/* Model 3: GEE with an exchangeable (compound-symmetry) working correlation */
proc genmod data=Example1 descending;
  class Group ID;
  model outcome = Group / dist=bin;
  repeated subject=ID / type=cs covb corrw;
run;

Each model gave somewhat different results (this is a simulation exercise, so I do know the real proportions of success and failure, and also the within-subject correlation).
I have multiple questions:
Any information and details that help me understand which method to use, how, and when would be much appreciated. I do know the basic difference between GEE and GLMM (i.e., marginal vs. conditional).
Thank you in advance!
12-15-2015 10:40 AM
I don't know much about GEE models, but you should know there is a third procedure that you can use for this (PROC GEE, the GEE Procedure).
The GEE procedure was introduced in SAS/STAT 13.2.
Take care if you have a SAS/STAT release prior to 13.2, because in those releases the GEE procedure is experimental!
SAS/STAT(R) 14.1 User's Guide
The GEE Procedure
See also this paper:
Weighted Methods for Analyzing Missing Data with the GEE Procedure
Guixian Lin and Robert N. Rodriguez, SAS Institute Inc.
12-16-2015 01:49 AM
Thank you for your quick reply. Unfortunately I do not have access to PROC GEE (yet), so I still need an answer relating to the procedures I used. That said, I am looking forward to trying the new procedure in the future; I think that GEE deserves a procedure of its own.
12-17-2015 09:29 AM
A lot of your questions would require long answers to properly explain. I highly recommend you get the great textbook by Walt Stroup (Generalized Linear Mixed Models). You would learn a great deal and learn how to address all your questions.
12-20-2015 07:37 AM
I do have the book actually; it is not an easy one, but a very good one. If I may, I will try to focus on a couple of smaller questions; maybe you can guide me a little bit.
In the clinical trial setting, is there any rule for when each model should be used? For example, is it correct that in prospective randomized trials the results of both models should be roughly the same? Is that so in all cases? And my second question: if I have a model with a binary outcome, one factor variable (treatment vs. control) and another factor covariate (with 4 levels), and I remove one level from the covariate, how will it affect each model?
Thank you!
12-20-2015 05:04 PM
Two other references that may be easier for you:
book by Ed Gbur and co-authors (Analysis of Generalized Linear Mixed Models)
article in the publication Agronomy Journal by Walter Stroup (published in 2014, I believe) on GLMMs.
These are for the agricultural sciences, but should be much clearer for you.
The conditional model (one without a scale parameter, where overdispersion is handled by adding a random effect for the lowest level in the hierarchy) properly targets the probability of a trait for the subject (the lowest level in the hierarchy) as a function of the predictor variables. The GEE model (or, in general, any model that handles overdispersion by rescaling all the SEs) targets the marginal distribution: the mean for the observations over all the subjects. The variance among subjects, in addition to the probability parameter, determines this mean proportion. Only for normal distributions are these two the same thing. With binary or binomial data, when the probability is less than 1/2, the marginal mean proportion for the observations is larger than the conditional probability.

Stroup makes the argument that researchers usually want the probability from the conditional model, although there are exceptions. Since the target of the inference is not the same, one does not get the same result from the two approaches. The results can be similar, but they should not be expected to agree closely; it depends on the variances/covariances. There is a great section in chapter 3 of the Stroup book about this (conditional vs. marginal inference). It is perhaps the most important conceptual part of the book.
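The gap between the conditional and the marginal probability can be checked numerically. The following is only a rough sketch, written in Python rather than SAS so it can be run anywhere. It assumes a logit link and a normally distributed random intercept (as in the random-intercept GLIMMIX model earlier in the thread). The function names and the value sigma = 2 are made up for illustration and are not taken from the trial data.

```python
import math

def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))

def marginal_probability(p_conditional, sigma, n_steps=4000, width=8.0):
    """Average expit(eta + u) over u ~ Normal(0, sigma^2) with a midpoint rule.

    p_conditional is the probability for a subject whose random effect is 0;
    sigma is the (assumed) SD of the random intercept on the logit scale.
    """
    if sigma == 0:
        return p_conditional  # no between-subject variance: both targets agree
    eta = math.log(p_conditional / (1.0 - p_conditional))
    lo = -width * sigma
    h = 2.0 * width * sigma / n_steps
    total = 0.0
    for i in range(n_steps):
        u = lo + (i + 0.5) * h
        density = math.exp(-0.5 * (u / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        total += expit(eta + u) * density * h
    return total

# Compare the two targets for a few conditional probabilities (sigma is hypothetical).
for p in (0.1, 0.3, 0.5, 0.9):
    print(p, round(marginal_probability(p, sigma=2.0), 3))
```

Under these assumptions the marginal proportion is pulled toward 1/2: it is larger than the conditional probability when that probability is below 1/2, smaller when it is above 1/2, and the two coincide when sigma is 0.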
Removing a level from a factor will certainly change the model fit and the other parameter estimates.