billi_billi
Calcite | Level 5

Hello 

This is my first time working with clustered data. I am using PROC GEE to account for the clustering, but the variables I named in the SUBJECT= and WITHIN= options of the REPEATED statement have some missing values. Is there any way to overcome this? Because of these missing values I am not getting any results, just an error saying 'A missing value was detected in the SUBJECT, WITHINSUBJECT, or LOGORVAR effect. All values of variables in these effects must be non-missing.'

Below is my code: 

 

PROC GEE DATA=DATA DESC;
CLASS ID AREA DOCTOR_VISIT ;
MODEL DOCTOR_VISIT=AGE/ DIST=BIN LINK=LOGIT;
REPEATED SUBJECT=ID/ WITHIN=AREA CORR=CS;
RUN;

 

 

StatDave
SAS Super FREQ
Those values have to be nonmissing so that the data for an observation is properly associated with a cluster (subject) and properly positioned within the cluster.
billi_billi
Calcite | Level 5

@StatDave Thank you for the reply. Is there any option that I can use to exclude the missing data?

StatDave
SAS Super FREQ

Yes, just include a WHERE statement that excludes missing values of the SUBJECT= and WITHIN= variables, like:
where id ne . and area ne .;
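A minimal sketch of where the WHERE statement sits, using the data set and variable names from the original post. It makes no assumption about whether ID and AREA are numeric or character: the MISSING function handles both, whereas a comparison with . only suits numeric variables.

```sas
/* Sketch only: data set and variable names are taken from the original post. */
proc gee data=data desc;
   /* Drop rows whose cluster identifiers are missing before the model is fit. */
   where not missing(id) and not missing(area);
   class id area doctor_visit;
   model doctor_visit = age / dist=bin link=logit;
   repeated subject=id / within=area corr=cs;
run;
```

Rows removed this way are excluded from the analysis entirely, so it is worth checking how many observations are lost.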

billi_billi
Calcite | Level 5

@StatDave Thank you, this worked. But then I got the error 'The within effect should be unique'. I assume this is because some subjects have duplicate values of the WITHIN= variable, so I changed my code from this:

 

PROC GEE DATA=ED_DATA DESC;
WHERE ID NE . AND area NE . ;
CLASS ID area doctor_visit ;
MODEL doctor_visit=age/ DIST=BIN LINK=LOGIT;
REPEATED SUBJECT=ID/ WITHIN=area CORR=IND;
RUN;

 

TO THIS CODE:

 

PROC GEE DATA=ED_DATA DESC;
WHERE ID NE . AND area NE . ;
CLASS ID area doctor_visit ;
MODEL doctor_visit=age/ DIST=BIN LINK=LOGIT;
REPEATED SUBJECT=ID*area/  CORR=IND;
RUN;

 

I just wanted to check whether this is correct or whether there is some other way to overcome the above error. Also, is there any way to get odds ratios?

 

Sorry to bombard you with so many questions, but I really appreciate your help. Thank you very much.

StatDave
SAS Super FREQ

You might not be understanding the purpose of the WITHIN= variable - it is used to order the observations within each cluster. This is necessary when the correlation structure requires ordering such as with the TYPE=AR structure. A separate issue is whether your SUBJECT= variable has unique values for every cluster. If that variable repeats values in the data set and not all of them are in the same cluster, then you might need to involve a second variable as discussed in this note
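As a hedged illustration of involving a second variable: if the same ID values recur in different areas, and each ID within an area is meant to be its own cluster, the subject effect can be written as a nested effect. Whether that matches this study's design is an assumption. The variable names come from the posted code.

```sas
/* Assumption: ID values repeat across areas, and each ID within an
   area is a separate cluster. Names are taken from the posted code. */
proc gee data=ed_data desc;
   where not missing(id) and not missing(area);
   /* Both variables in the subject effect must be CLASS variables. */
   class id area doctor_visit;
   model doctor_visit = age / dist=bin link=logit;
   /* Nested subject effect: clusters are identified by ID within AREA. */
   repeated subject=id(area) / corr=exch;
run;
```

If ID already identifies each cluster uniquely across the whole data set, SUBJECT=ID alone is enough and AREA can be left out of the REPEATED statement.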

billi_billi
Calcite | Level 5

@StatDave Thank you very much for this article. This really helped. I am trying to find out whether there is any documentation on how to resolve the error I got for the WITHINSUBJECT= variable.

If you happen to know of any documentation related to this, could you please direct me to it? If not, thank you very much for all the replies and your information.

StatDave
SAS Super FREQ
As I said, ordering within clusters is important only when the specified correlation structure takes it into account. The exchangeable structure (TYPE=EXCH or CS) does not, so you do not need to specify the WITHIN= option.
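Putting that together, here is a sketch of the model without WITHIN=, again using the names from the posted code. The ESTIMATE statement with the EXP option is one possible way to address the earlier odds ratio question: it exponentiates the log-odds coefficient for AGE. Treat it as an assumption to check against the PROC GEE documentation for your SAS release.

```sas
/* Sketch only: assumes ID alone identifies each cluster. */
proc gee data=ed_data desc;
   where not missing(id);
   class id doctor_visit;
   model doctor_visit = age / dist=bin link=logit;
   /* Exchangeable correlation ignores ordering, so no WITHIN= option. */
   repeated subject=id / corr=exch;
   /* Assumed approach: the exponentiated AGE coefficient is the odds
      ratio for a one-unit increase in age; CL requests confidence limits. */
   estimate 'Odds ratio for age' age 1 / exp cl;
run;
```

With AREA no longer used in the REPEATED statement, it does not need to appear in the CLASS or WHERE statements unless it enters the model some other way.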
