varatt90
Obsidian | Level 7

Hi, 

 

I am trying to figure out which procedure (PROC GLIMMIX, PROC GENMOD, PROC GEE) best suits what I am trying to model. 

 

My model has:

  • nominal outcome variable (e.g., Drug A, Drug B, Drug C)
  • categorical and continuous predictors
  • clustering (e.g., hospitalID)
  • correlated data
  • study design: repeated cross-sectional

 

So far this is what I have learned after reading through some SAS documents:

 

 

                                   PROC GLIMMIX   PROC GEE          PROC GENMOD
Fixed or Random                    Random         Fixed             Fixed
Outcome                            Categorical    Binary/Nominal    Binary/Ordinal
Level inference is made            Subject        Population        Population
Handles Student level Clustering   Yes            Yes               Yes
Handles Correlated data            No             Yes               Yes
Missing Data                       ???            MCAR / Complete   MCAR / Complete

 

Question 1: Does the table above accurately describe the differences/similarities?

 

Question 2: If I'm interested in population-based averages and would like to present two models (Model 1: Drug B vs. Drug A; Model 2: Drug C vs. Drug A), does it make sense to use PROC GEE with a binomial distribution, logit link function and working log odds ratio correlation structure?
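For concreteness, Model 1 as described in Question 2 might be sketched roughly as follows. This is only an illustration: the data set name `study`, the character variable `drug` with values 'A', 'B', 'C', and the predictors `sex` and `age` are all hypothetical, and the exchangeable log odds ratio structure is just one possible choice.

```sas
/* Sketch of Model 1 (Drug B vs. Drug A); all names are assumed */
proc gee data=study;
   where drug in ('A', 'B');               /* keep only the two drugs being compared */
   class hospitalID sex;                    /* cluster variable must be a CLASS variable */
   model drug(event='B') = sex age
         / dist=bin link=logit;             /* binomial distribution, logit link */
   repeated subject=hospitalID
         / logor=exch;                      /* exchangeable log odds ratio within hospital */
run;
```

Model 2 would be the same step with `where drug in ('A', 'C')` and `event='C'`.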

Thank you!

 

SteveDenham
Jade | Level 19

I think you have it down for GEE and GENMOD. GLIMMIX allows for a variety of correlated data, including multilevel effects. It also deals with missingness up to missing at random (it does not eliminate records that have missing values for model factors). Use the first example in the PROC GEE documentation for a good comparison of marginal and random effect models.

I have two concerns. The first is that I am not clear on the use of Drug as a response variable when your research question talks about comparing levels of Drug. I would consider Drug as a fixed effect to be included in the model, and the response to be something measured on the patient (cured-not cured, for example).

The second is that I cannot tell if there are two levels of clustering here--patient level and hospital level. If each patient is measured one time then there are no patient level clusters - the patient level effect is the "residual" or scale estimate. Since you mention that the design is repeated this would be treated as a patient/student level cluster. However, hospital needs to be specified as either fixed (inference space is then repeated studies at the specified hospitals) or random (inference space is repeated studies at the greater population of hospitals, of which the ones in the data represent a "random" sample).

 

From this, I can see a PROC GEE approach for the narrow inference space of the sample of hospitals, or a PROC GLIMMIX approach for the broad inference space of "all hospitals".

 

SteveDenham

varatt90
Obsidian | Level 7

Thank you for your response!

 

"GLIMMIX allows for a variety of correlated data, including multilevel effects. It also deals with missingness up to missing at random (it does not eliminate records that have missing values for model factors)"

So, I can account for the correlation of subjects within the same cluster by specifying the cluster variable (hospital/school) as a random effect. In other words, I can capture the variability among subjects.

 

"Use the first example in the PROC GEE documentation for a good comparison of marginal and random effect models. I have two concerns. The first is that I am not clear on the use of Drug as a response variable when your research question talks about comparing levels of Drug. I would consider Drug as a fixed effect to be included in the model, and the response to be something measured on the patient (cured-not cured, for example)."

Yes, I agree! Let's go with your example. 

 

"The second is that I cannot tell if there are two levels of clustering here--patient level and hospital level. If each patient is measured one time then there are no patient level clusters - the patient level effect is the "residual" or scale estimate. Since you mention that the design is repeated this would be treated as a patient/student level cluster."

The data is hierarchical. Patients are measured each year on the same variables (e.g., alcohol consumption, depression). So, I have patient data and then patients are recruited from different hospitals. The clustering is at the hospital level. So my repeated statement would be "repeated subject = HospitalID".
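In context, that REPEATED statement might sit in a PROC GEE step along these lines. This is a sketch only, using the cured/not cured example: the data set `study`, the 0/1 response `cured`, the treatment factor `drug`, and the numeric covariate `year` are hypothetical names, and the exchangeable working correlation is an assumed choice.

```sas
/* Sketch: marginal model with hospital as the cluster; all names are assumed */
proc gee data=study;
   class HospitalID drug;                   /* SUBJECT= variables must appear in CLASS */
   model cured(event='1') = drug year
         / dist=bin link=logit;             /* drug enters as a fixed effect */
   repeated subject=HospitalID
         / type=exch;                       /* same working correlation for all pairs within a hospital */
run;
```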

 

"However, hospital needs to be specified as either fixed (inference space is then repeated studies at the specified hospitals) or random (inference space is repeated studies at the greater population of hospitals, of which the ones in the data represent a "random" sample). From this, I can see a PROC GEE approach for the narrow inference space of the sample of hospitals, or a PROC GLIMMIX approach for the broad inference space of "all hospitals"."

So, I can use either PROC GEE or PROC GLIMMIX depending on whether I decide to state "hospitalID" as a random effect or fixed effect.

 

I'm not quite grasping the difference between "fixed - inference space repeated at specified hospitals" and "random - inference space is repeated studies at the greater population of hospitals". It seems like the variability that is caused by patients being recruited from different hospitals will be accounted for in each procedure.

 

Thank you for taking the time to help!

SteveDenham
Jade | Level 19

Fixed effect in GLIMMIX: The parameters and their error estimates are applicable to a narrow inference space. In this case at the hospital level, that space is the set of all possible repetitions of the trial at only the hospitals included in the analysis.

 

Random effect in GLIMMIX: The parameters and their error estimates are applicable to a broad inference space. In this case at the hospital level, that space is the set of all possible repetitions of the trial at any sample from the whole set of hospitals.

 

In both cases, the patients represent a random sample from the full set of possible patients.
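As a rough illustration of how the two specifications could differ in code, here is a minimal sketch. It assumes a hypothetical data set `study` with a 0/1 response `cured`, a treatment factor `drug`, and the cluster variable `hospitalID`; none of these names come from the actual data, and the Laplace estimation method is just one option.

```sas
/* Narrow inference space: hospital as a fixed effect (names are assumed) */
proc glimmix data=study;
   class hospitalID drug;
   model cured(event='1') = drug hospitalID
         / dist=binary link=logit solution; /* one parameter per hospital in the data */
run;

/* Broad inference space: hospital as a random effect (names are assumed) */
proc glimmix data=study method=laplace;
   class hospitalID drug;
   model cured(event='1') = drug
         / dist=binary link=logit solution;
   random intercept / subject=hospitalID;   /* hospitals treated as a sample from a population */
run;
```

In the first step the drug estimates are conditional on these particular hospitals. In the second, the hospital variance component feeds into the standard errors, which is what lets the conclusions extend to hospitals not in the sample.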

 

SteveDenham
