Hello
This is my first time working with clustered data. I am using PROC GEE and accounting for the clustering, but the variables I specified in the SUBJECT= and WITHIN= options of the REPEATED statement have some missing values. Is there any way to overcome this? Because of these missing values I get no results, only this error: 'A missing value was detected in the SUBJECT, WITHINSUBJECT, or LOGORVAR effect. All values of variables in these effects must be non-missing.'
Below is my code:
PROC GEE DATA=DATA DESC;
CLASS ID AREA DOCTOR_VISIT ;
MODEL DOCTOR_VISIT=AGE/ DIST=BIN LINK=LOGIT;
REPEATED SUBJECT=ID/ WITHIN=AREA CORR=CS;
RUN;
@StatDave Thank you for the reply. Is there any option that I can use to exclude the missing data?
Yes, just include a WHERE statement like:
where doctor_visit ne . and area ne .;
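Since the error message refers to the SUBJECT= and WITHIN= effects, the filter presumably needs to cover those variables (ID and AREA in the code above). Here is a minimal sketch, assuming the data set and variable names from the original post. It uses the MISSING function, which handles both numeric and character variables, whereas a comparison with `.` only tests for a numeric missing value.

```sas
/* Sketch only: keep rows where both REPEATED-statement variables are present. */
/* MISSING() works for numeric and character variables alike.                  */
proc gee data=data descending;
  where not missing(id) and not missing(area);
  class id area doctor_visit;
  model doctor_visit = age / dist=bin link=logit;
  repeated subject=id / within=area corr=cs;
run;
```

Observations dropped by the WHERE statement are excluded from the whole analysis, so it may be worth checking how many rows remain.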
@StatDave Thank you, this worked. But then I got the error 'The within effect should be unique'. I assume this is because some subjects have duplicate values of the WITHIN= variable, so I changed my code from this:
PROC GEE DATA=ED_DATA DESC;
WHERE ID NE . AND area NE . ;
CLASS ID study_census_tract3 doctor_visit ;
MODEL doctor_visit=age/ DIST=BIN LINK=LOGIT;
REPEATED SUBJECT=ID/ WITHIN=area CORR=IND;
RUN;
to this code:
PROC GEE DATA=ED_DATA DESC;
WHERE ID NE . AND area NE . ;
CLASS ID study_census_tract3 doctor_visit ;
MODEL doctor_visit=age/ DIST=BIN LINK=LOGIT;
REPEATED SUBJECT=ID*area/ CORR=IND;
RUN;
I just wanted to check whether this is correct or whether there is some other way to overcome the above error. Also, is there any way to get odds ratios?
Sorry to bombard you with so many questions, but I really appreciate your help. Thank you very much.
You might not be understanding the purpose of the WITHIN= variable: it is used to order the observations within each cluster. This is necessary when the correlation structure requires ordering, such as with the TYPE=AR structure. A separate issue is whether your SUBJECT= variable has unique values for every cluster. If values of that variable repeat in the data set and the repeats do not all belong to the same cluster, then you might need to involve a second variable as discussed in this note.
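To illustrate the point about WITHIN=, and the earlier odds-ratio question, here is a hedged sketch. It assumes that ID really does identify the cluster and that an exchangeable correlation structure is wanted, so no within-cluster ordering (and no WITHIN= variable) is needed. The ESTIMATE statement with the EXP option exponentiates the logit-scale coefficient for AGE, which should give an odds ratio per one-unit increase in age. The label text is arbitrary.

```sas
/* Sketch only: assumes ID is the clustering variable and that an     */
/* exchangeable structure is appropriate, so WITHIN= is omitted.      */
proc gee data=ed_data descending;
  where not missing(id);
  class id;
  model doctor_visit = age / dist=bin link=logit;
  repeated subject=id / corr=exch;
  /* EXP reports exp(estimate), here the odds ratio for AGE */
  estimate 'OR per 1-unit increase in age' age 1 / exp;
run;
```

If the clusters are actually defined by a different variable (for example the area), the SUBJECT= effect would need to change accordingly.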
@StatDave Thank you very much for this article. This really helped. I am trying to find documentation on how to resolve the error I got for the WITHINSUBJECT= variable.
If you happen to know of any documentation related to this, could you please direct me to it? If not, thank you very much for all the replies and information.
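One way to investigate the 'within effect should be unique' error, assuming it is raised when a subject has more than one observation at the same level of the WITHIN= variable, is to list the ID and area combinations that occur more than once. This is a diagnostic sketch using the data set and variable names from the code above, not an official fix.

```sas
/* Sketch only: list ID/area combinations that appear more than once, */
/* which would make the WITHIN= effect non-unique within a subject.   */
proc sql;
  select id, area, count(*) as n_obs
    from ed_data
    where not missing(id) and not missing(area)
    group by id, area
    having count(*) > 1;
quit;
```

If this query returns rows, those subjects have repeated values of the within-subject variable, and either a different ordering variable or no WITHIN= option at all may be more appropriate.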