Re: Repeated Measures: PROC GLIMMIX vs. PROC GENMOD vs. PROC GEE

varatt90 · Posted 10-25-2022 11:38 PM

Hi,

I am trying to figure out which procedure (PROC GLIMMIX, PROC GENMOD, PROC GEE) best suits what I am trying to model.

My model has:

nominal outcome variable (e.g., Drug A, Drug B, Drug C)
categorical and continuous predictors
clustering (e.g., hospitalID)
correlated data
study design: repeated cross-sectional

So far this is what I have learned after reading through some SAS documents:

	PROC GLIMMIX	PROC GEE	PROC GENMOD
Fixed or Random	Random	Fixed	Fixed
Outcome	Categorical	Binary/Nominal	Binary/Ordinal
Level inference is made	Subject	Population	Population
Handles Student level Clustering	Yes	Yes	Yes
Handles Correlated data	No	Yes	Yes
Missing Data	???	MCAR Complete	MCAR Complete

Question 1: Does the table above accurately describe the differences/similarities?

Question 2: If I'm interested in population-based averages and would like to present two models (Model 1: Drug B vs. Drug A ; Model 2: Drug C vs. Drug A), does it make sense to use PROC GEE with a binomial distribution, logit link function and workable log odds ratio correlation structure?

Thank you!

Ksharp · Posted 10-26-2022 08:26 AM

Calling @StatDave @SteveDenham @lvm

SteveDenham · Posted 10-26-2022 09:59 AM

I think you have it down for GEE and GENMOD. GLIMMIX allows for a variety of correlated data, including multilevel effects. It also deals with missingness up to missing at random (it does not eliminate records that have missing values for model factors). Use the first example in the PROC GEE documentation for a good comparison of marginal and random effect models.

I have two concerns. The first is that I am not clear on the use of Drug as a response variable when your research question talks about comparing levels of Drug. I would consider Drug as a fixed effect to be included in the model, and the response to be something measured on the patient (cured-not cured, for example).

The second is that I cannot tell if there are two levels of clustering here: patient level and hospital level. If each patient is measured one time, then there are no patient-level clusters; the patient-level effect is the "residual" or scale estimate. Since you mention that the design is repeated, this would be treated as a patient/student-level cluster. However, hospital needs to be specified as either fixed (inference space is then repeated studies at the specified hospitals) or random (inference space is repeated studies at the greater population of hospitals, of which the ones in the data represent a "random" sample).

From this, I can see a PROC GEE approach for the narrow inference space of the sample of hospitals, or a PROC GLIMMIX approach for the broad inference space of "all hospitals".
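As a rough sketch of those two routes, the code below fits a marginal model in PROC GEE and a random-intercept model in PROC GLIMMIX. The data set name `study` and the variables `cured` (0/1), `drug`, `age`, and `year` are assumptions made up for illustration; only `hospitalID` comes from the thread. The exchangeable working correlation and the Laplace estimation method are illustrative choices, not recommendations.

```sas
/* Hypothetical data set STUDY: one row per patient per survey year.  */
/* CURED (0/1), DRUG, AGE, and YEAR are assumed variable names.       */

/* Marginal (population-averaged) model with hospitals as clusters.   */
/* DESCEND models the probability of CURED=1 rather than CURED=0.     */
proc gee data=study descend;
   class hospitalID drug year;
   model cured = drug year age / dist=bin link=logit;
   repeated subject=hospitalID / corr=exch corrw;
run;

/* Conditional (cluster-specific) model with a random hospital        */
/* intercept; estimates are on the logit scale given the hospital.    */
proc glimmix data=study method=laplace;
   class hospitalID drug year;
   model cured(event='1') = drug year age / dist=binary link=logit solution;
   random intercept / subject=hospitalID;
run;
```

The drug coefficients from the two fits are not expected to match, because the first describes population-averaged odds and the second describes odds within a given hospital.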

SteveDenham

varatt90 · Posted 10-26-2022 11:07 AM

Thank you for your response!

"GLIMMIX allows for a variety of correlated data, including multilevel effects. It also deals with missingness up to missing at random (it does not eliminate records that have missing values for model factors)"

So, I can account for the correlation of subjects within the same cluster by specifying the cluster variable (hospital/school) as a random effect. In other words, I can capture the variability among subjects.

"Use the first example in the PROC GEE documentation for a good comparison of marginal and random effect models. I have two concerns. The first is that I am not clear on the use of Drug as a response variable when your research question talks about comparing levels of Drug. I would consider Drug as a fixed effect to be included in the model, and the response to be something measured on the patient (cured-not cured, for example)."

Yes, I agree! Let's go with your example.

"The second is that I cannot tell if there are two levels of clustering here--patient level and hospital level. If each patient is measured one time then there are no patient level clusters - the patient level effect is the 'residual' or scale estimate. Since you mention that the design is repeated this would be treated as a patient/student level cluster."

The data is hierarchical. Patients are measured each year on the same variables (e.g., alcohol consumption, depression). So, I have patient data, and the patients are recruited from different hospitals. The clustering is at the hospital level. So my repeated statement would be "repeated subject = HospitalID".
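For reference, a REPEATED statement of that form might look like the sketch below in PROC GENMOD, here paired with an exchangeable log odds ratio structure (alternating logistic regression) as one possible choice. The data set `study` and the variables `cured`, `drug`, and `age` are hypothetical names used only for illustration.

```sas
/* Hypothetical data set STUDY with a binary response CURED (0/1).   */
/* LOGOR=EXCH requests alternating logistic regression with a single */
/* common log odds ratio for pairs of responses within a hospital.   */
proc genmod data=study descending;
   class hospitalID drug;
   model cured = drug age / dist=binomial link=logit;
   repeated subject=hospitalID / logor=exch;
run;
```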

"However, hospital needs to be specified as either fixed (inference space is then repeated studies at the specified hospitals) or random (inference space is repeated studies at the greater population of hospitals, of which the ones in the data represent a 'random' sample). From this, I can see a PROC GEE approach for the narrow inference space of the sample of hospitals, or a PROC GLIMMIX approach for the broad inference space of 'all hospitals'."

So, I can use either PROC GEE or PROC GLIMMIX depending on whether I decide to state "hospitalID" as a random effect or fixed effect.

I'm not quite grasping the difference between "fixed - inference space repeated at specified hospitals" and "random - inference space is repeated studies at the greater population of hospitals". It seems like the variability that is caused by patients being recruited from different hospitals will be accounted for in each procedure.

Thank you for taking the time to help!

SteveDenham · Posted 11-01-2022 09:31 AM

Fixed effect in GLIMMIX: The parameters and their error estimates are applicable to a narrow inference space. In this case at the hospital level, that space is the set of all possible repetitions of the trial at only the hospitals included in the analysis.

Random effect in GLIMMIX: The parameters and their error estimates are applicable to a broad inference space. In this case at the hospital level, that space is the set of all possible repetitions of the trial at any sample from the whole set of hospitals.

In both cases, the patients represent a random sample from the full set of possible patients.
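In code, the two specifications could be sketched roughly as follows. The data set `study` and the variables `cured` (0/1) and `drug` are hypothetical names for illustration; only `hospitalID` appears in the thread.

```sas
/* Narrow inference space: hospital enters as a fixed CLASS effect,  */
/* so drug comparisons apply to these particular hospitals only.     */
proc glimmix data=study;
   class hospitalID drug;
   model cured(event='1') = drug hospitalID / dist=binary link=logit solution;
   lsmeans drug / ilink diff;
run;

/* Broad inference space: hospital enters as a random intercept, so  */
/* the hospitals are treated as a sample from a larger population.   */
proc glimmix data=study method=laplace;
   class hospitalID drug;
   model cured(event='1') = drug / dist=binary link=logit solution;
   random intercept / subject=hospitalID;
   lsmeans drug / ilink diff;
run;
```

In the first fit, each hospital costs a model parameter; in the second, between-hospital variability is summarized by a single variance component, which typically widens the standard errors of the drug comparisons to reflect the broader inference space.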

SteveDenham
