IndiAnna

Hi Everyone,

I could use some help getting PROC GLIMMIX (or another SAS procedure, if more appropriate) to model some correlated binary data.

This is patient data, where the outcome is "yes" or "no" (did the patient have the event in question).  Other variables of interest are baseline measures (height, weight, etc.), physician conducting procedure, and hospital where physician conducted procedure.  Some patients have more than one observation, while others have only one.  So, I want to account for correlation within the same patient, within the same physician, and within the same hospital.

Here is the code I am using:

proc glimmix data=events;
     class patient passes physician priors site related;
     model related = priors passes postwt bmi/dist=binary link=logit ddfm=bw solution;
     random _residual_/subject=patient solution; /*patient is repeated measure*/
     random physician/ solution; /*physician and site are random effects*/
     random site/solution;
run;

When I use this code, depending on which covariates I use, I sometimes get no estimates (solutions for random effects) for site and/or for physician.  This makes me wonder how valid the model is.  If I just use physician or site as a fixed effect, the model does not converge.  Also, this doesn't seem quite like the correct way to model the data.

I would like to know whether certain physicians are associated with events of interest and also whether certain hospitals are associated with events of interest.  This is in addition to the relationship between covariates like height, weight, and age and events of interest.

Is there a more appropriate way for me to model this data?  I am not sure that the syntax I am using is actually achieving the desired model.

Many thanks in advance!

JacobSimonsen

Maybe the problem is that you have a site or a physician where all outcomes are equal. In that case, you cannot estimate the effect for that site or physician when it enters the model as a fixed effect. You can, however, estimate it as a random effect, because you then assume that the site and physician effects are normally distributed with mean zero, so no single effect can be too extreme. That is why the model converges with random effects but not with fixed effects.
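
One quick way to check for this is a pair of cross-tabulations. This is only a rough sketch using the data set and variable names from the original post (events, site, physician, related). Any site or physician with a zero count in one of the outcome columns has all outcomes equal:

```sas
/* Cross-tabulate each clustering variable against the outcome.   */
/* A row with a zero count in one outcome column means every      */
/* observation for that site or physician has the same outcome.   */
proc freq data=events;
    tables site*related physician*related / norow nocol nopercent;
run;
```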

SteveDenham

Do you have an indexing variable for the repeated measures on the patient?  Also, you have "related" as the response variable, but it is also included in the CLASS statement; I would recommend removing it from the CLASS statement.  I have a couple of other changes that are recommended for convergence problems (restatements of the RANDOM statements), and I recommend using method=laplace.  If you do have an indexing variable (called visitindex here), a crude approach might be the following:

proc glimmix data=events method=laplace;
     class patient passes physician priors site visitindex;
     model related = priors passes postwt bmi/dist=binary link=logit ddfm=bw solution;
     random visitindex/subject=patient type=cs solution; /*patient is repeated measure*/
     random intercept/subject=physician solution; /*physician and site are random effects*/
     random intercept/subject=site solution;
run;
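
If the data set has no indexing variable yet, one could be built with a sort and a DATA step. This is only a sketch: it assumes the rows for each patient already appear in visit order in events, and the output names events_sorted and events_indexed are made up for illustration. If a visit date or similar variable exists, it should be added to the BY statement of the sort.

```sas
/* Group the rows by patient. PROC SORT keeps the original relative  */
/* order of rows within a BY group by default (EQUALS option).       */
proc sort data=events out=events_sorted;   /* events_sorted: hypothetical name */
    by patient;
run;

/* Number the observations 1, 2, 3, ... within each patient. */
data events_indexed;                       /* events_indexed: hypothetical name */
    set events_sorted;
    by patient;
    if first.patient then visitindex = 0;  /* reset at each new patient */
    visitindex + 1;                        /* sum statement: value is retained across rows */
run;
```

The model would then be fit with data=events_indexed in place of data=events.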

Other questions:

Are physicians located within site, or might a given physician be at one or more sites?
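
A query along these lines could answer that from the data. It is a sketch that assumes the variable names from the original post; the column alias n_sites is made up for illustration.

```sas
/* List physicians who appear at more than one site.                */
/* An empty result suggests physicians are nested within sites.     */
proc sql;
    select physician,
           count(distinct site) as n_sites
    from events
    group by physician
    having count(distinct site) > 1;
quit;
```

If no rows come back and physician labels are reused across sites, the physician effect may need to be written as nested, for example subject=physician(site) in the RANDOM statement.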

How many patients are included?  If this number is substantially greater than the number of physicians and sites, then the G-side implementation above (conditional approach) could be replaced with an R-side parameterization (marginal approach), such as:

proc glimmix data=events;
     class patient passes physician priors site visitindex;
     model related = priors passes postwt bmi/dist=binary link=logit ddfm=bw solution;
     random visitindex/residual subject=patient type=cs solution; /*patient is repeated measure*/
     random intercept/subject=physician solution; /*physician and site are random effects*/
     random intercept/subject=site solution;
run;
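
To see which situation applies, the number of distinct patients, physicians, and sites can be counted directly. This is a sketch that assumes the variable names from the original post; the column aliases are made up for illustration.

```sas
/* Count the distinct levels of each clustering variable. */
proc sql;
    select count(distinct patient)   as n_patients,
           count(distinct physician) as n_physicians,
           count(distinct site)      as n_sites
    from events;
quit;
```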

Good luck.

Steve Denham
