Statistical Procedures

adworsky · Posted 07-19-2024 04:20 PM

Hi.

I have about 60,000 records for children in 12 states. Some children have multiple records. Some have just one record.

I have a binary dependent variable. I am trying to estimate a model that takes into account the fact that the records for the same child are correlated and that children are nested within states. I know how to estimate models with GEE to deal with the correlation but am not sure how to deal with the nesting. The data look like this.

State	Child	X	Y
1	1	1	1
1	1	0	1
1	2	1	1
1	2	1	0
1	2	0	0
1	2	0	0
2	3	0	1
2	3	0	1
2	3	1	1
2	3	0	0
2	3	0	1
2	4	1	0
2	4	0	1
2	4	1	0
3	5	1	0
3	6	0	0
3	7	0	0
3	7	0	1
3	7	1	1
3	7	1	0

Ksharp · Posted 07-20-2024 03:03 AM

You could combine these two variables into ONE variable (SubjectID) and feed it into PROC GEE.

SubjectID=catx('|',State,Child);

An alternative is to use a nested subject effect (see the example in the PROC GEE documentation):
proc gee data=Resp descend;
   class ID Treatment Center Sex Baseline;
   model Outcome=Treatment Center Sex Age Baseline /
         dist=bin link=logit;
   repeated subject=ID(Center) / corr=exch corrw;
run;

For your data, change "subject=ID(Center)" to "subject=Child(State)" and list Child and State in the CLASS statement.

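Applied to the data in the question, the two suggestions above might look like the sketch below. The data set name `kids` is hypothetical, and the rows are copied from the original post. Both versions treat each child as the cluster. State enters only through the subject definition, so if state-level differences matter, State could also be added to the MODEL statement, as Center is in the documentation example.

```sas
/* Hypothetical data set name; rows are the sample shown in the question */
data kids;
   input State Child X Y;
   length SubjectID $ 20;
   /* Option 1: one combined identifier, e.g. "2|3" for child 3 in state 2 */
   SubjectID = catx('|', State, Child);
   datalines;
1 1 1 1
1 1 0 1
1 2 1 1
1 2 1 0
1 2 0 0
1 2 0 0
2 3 0 1
2 3 0 1
2 3 1 1
2 3 0 0
2 3 0 1
2 4 1 0
2 4 0 1
2 4 1 0
3 5 1 0
3 6 0 0
3 7 0 0
3 7 0 1
3 7 1 1
3 7 1 0
;
run;

/* Option 1: combined identifier as the subject */
proc gee data=kids;
   class SubjectID;
   model Y(event='1') = X / dist=bin link=logit;
   repeated subject=SubjectID / corr=exch corrw;
run;

/* Option 2: nested subject effect, no new variable needed */
proc gee data=kids;
   class Child State;
   model Y(event='1') = X / dist=bin link=logit;
   repeated subject=Child(State) / corr=exch corrw;
run;
```

If child IDs are already unique across states, as they are in the sample rows, the two options define the same clusters. The combined or nested form matters when the same Child value can appear in more than one state.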

adworsky · Posted 11-27-2024 03:59 PM

Thank you.

StatDave · Posted 07-20-2024 12:34 PM

First, PROC GEE is a newer procedure designed specifically for fitting GEE models and is the recommended procedure for that purpose. Next, see this note on specifying the TYPE= correlation structure. As mentioned there, the GEE method is robust to misspecification of the correlation structure, so if you believe that all measurements within STATE are correlated (to varying degrees), you could simply specify SUBJECT=STATE and TYPE=EXCH.

That said, the Alternating Logistic Regressions (ALR) method available in PROC GEE (and PROC GENMOD), which models the log odds ratios among pairs of measurements, allows nested structures. The following code builds on the ALR example in the GENMOD documentation: it adds a subcluster variable so that each cluster of size 4 now contains 2 subclusters of size 2. The ALR model then estimates two log odds ratios: one for pairs within the same subcluster and one for pairs in different subclusters.

/* Split each cluster of 4 visits into 2 subclusters of 2 visits */
data resp;
   set resp;
   if visit in (1,2) then subclusID=1;
   else subclusID=2;
run;

/* ALR with a 1-nested log odds ratio structure */
proc gee data=resp;
   class id treatment(ref="P") center(ref="1") sex(ref="M")
         baseline(ref="0") subclusID;
   model outcome(event='1')=treatment center sex age baseline / dist=bin;
   repeated subject=id(center) / logor=nest1 subcluster=subclusID;
run;
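
For the data in the question, one possible mapping (an assumption, not something confirmed in this thread) is to treat each state as the cluster and each child as the subcluster. The sketch below shows both the simple exchangeable specification and the ALR version. The data set name `kids` is hypothetical and is assumed to hold the variables State, Child, X, and Y from the original post.

```sas
/* Assumes a hypothetical data set KIDS with State, Child, X, Y */

/* Simple version: every record within a state is treated as exchangeable */
proc gee data=kids;
   class State;
   model Y(event='1') = X / dist=bin;
   repeated subject=State / type=exch;
run;

/* ALR version: states as clusters, children as subclusters.
   Estimates one log odds ratio for pairs of records from the same child
   and one for pairs from different children in the same state. */
proc gee data=kids;
   class State Child;
   model Y(event='1') = X / dist=bin;
   repeated subject=State / logor=nest1 subcluster=Child;
run;
```

With only 12 states and thousands of records per state, each cluster is very large, so the ALR fit may be slow, and empirical standard errors based on so few clusters should be read with caution.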

Another approach might be to fit a GEE-like marginal model in PROC GLIMMIX with an appropriate covariance structure.
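
The post does not say which structure to use, so the following is only a sketch under stated assumptions: a compound-symmetry R-side structure within each child, with empirical (sandwich) standard errors, which is the usual way to get a GEE-type fit from PROC GLIMMIX. The data set name `kids` is hypothetical and is assumed to hold State, Child, X, and Y from the original post.

```sas
/* Assumes a hypothetical data set KIDS with State, Child, X, Y */
proc glimmix data=kids empirical;
   class State Child;
   model Y(event='1') = X / dist=binary link=logit solution;
   /* R-side (marginal) covariance: records from the same child
      share a common correlation */
   random _residual_ / subject=Child(State) type=cs;
run;
```

This version accounts only for correlation within a child. State could be added to the MODEL statement as a fixed effect, or handled through a different SUBJECT= and TYPE= choice, depending on how the state-level dependence is expected to behave.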

Design with repeated measures and nesting
