Hi
I am working on an alternating logistic regression (ALR) model with repeated measures.
The drug is the unit of analysis, and there are TWO levels of clustering—within patient, and patient within MD
My dataset looks like this (My dataset contains approximately 15000 rows)
data drugs;
input patient MD Gender Age drug $ indication outcome;
datalines;
1 1 2 54 a 1 1
1 1 2 54 b 1 1
1 1 2 54 c 0 0
2 4 1 41 a 1 0
2 4 1 41 c 1 0
2 4 1 41 e 1 1
3 1 1 24 h 0 0
4 5 2 29 c 1 0
5 1 2 72 a 1 1
5 1 2 72 a 0 1
6 2 2 72 i 0 1
6 2 2 72 b 0 0
7 1 1 36 a 0 0
8 3 1 25 a 0 1
;
PROC GENMOD DATA= drugs;
CLASS MD PATIENT gender(ref="1") indication(ref="1");
MODEL outcome (event='1')= gender indication age / DIST=BIN ;
REPEATED SUBJECT= PATIENT / TYPE=EXCH ;
RUN ;
What is wrong with my model?
Thank you very much for your help
Can you describe in a bit more detail the research question the analysis is supposed to answer?
Hello,
Thank you for your response.
I am working on evaluating whether certain drugs are prescribed correctly, which will be my outcome measure. If the prescription complies with the recommended dosage, the outcome will be assigned a value of 1; if not, it will be assigned a value of 0.
Please note that a patient may be prescribed more than one medication, and around 15000 patients are treated by approximately 150 physicians.
According to the examples in the PROC GEE and PROC GENMOD documentation, you need the LOGOR= option in the REPEATED statement to fit an ALR model.
PROC GENMOD DATA= drugs;
CLASS MD PATIENT gender(ref="1") indication(ref="1");
MODEL outcome (event='1')= gender indication age / DIST=BIN ;
REPEATED SUBJECT= PATIENT / LOGOR=FULLCLUST ;
RUN ;
Also, to fit a GEE model it is better to use the newer PROC GEE.
PROC GEE DATA= drugs;
CLASS PATIENT gender(ref="1") indication(ref="1");
MODEL outcome (event='1')= gender indication age / DIST=BIN ;
REPEATED SUBJECT= PATIENT / LOGOR=FULLCLUST ;
RUN ;
Hello,
Thank you for your response. However, in the case of GEE, why is the physician missing in the repeated subject?
Questions like this that are about statistical methods or statistical procedures will be addressed faster and get more attention if you post them in the Analytics>Statistical Procedures community.
First, note that the GEE method is robust to not specifying exactly the right clustering structure, so it is not unreasonable to use the GEE model from the code you showed in your first post.
However, if you particularly want to use an ALR model to estimate log odds ratios, and if your data consists of patients clustered within physicians and with multiple observations from patients, then you need to change your REPEATED statement options. Instead of TYPE=EXCH, which requests a GEE model, specify SUBJECT=MD (I assume that MD is your physician indicator) then specify LOGOR=NEST1 and SUBCLUSTER=PATIENT:
repeated subject=md / logor=nest1 subcluster=patient;
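For context, here is a sketch of how that REPEATED statement might fit into the full step from the first post. It assumes the DRUGS data set and variable names shown there, and that patient IDs are unique across physicians; treat it as a starting point rather than a verified final model.

```sas
/* Sketch only: ALR with patients (subclusters) nested within physicians (clusters). */
/* Assumes the DRUGS data set and variable names from the first post.                */
proc genmod data=drugs;
   class md patient gender(ref="1") indication(ref="1");
   model outcome(event='1') = gender indication age / dist=bin;
   /* LOGOR=NEST1 estimates two log odds ratios: one for pairs of responses  */
   /* from the same patient, one for pairs from different patients of the   */
   /* same physician.                                                        */
   repeated subject=md / logor=nest1 subcluster=patient;
run;
```

Note that both MD and PATIENT stay in the CLASS statement, as in the original code.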
While LOGOR=NESTK (with SUBCLUSTER=PATIENT) or LOGOR=FULLCLUST are other possible structures, they are probably not feasible since you indicate that, on average, there are 100 patients per physician. These structures would probably require the estimation of far too many log odds ratios, and in any case would probably not be useful.
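For a rough sense of scale: LOGOR=FULLCLUST uses one log odds ratio parameter for each unique pair of responses within a cluster. Assuming, purely for illustration, a physician cluster with n = 100 responses (one per patient; more if patients have several drugs), the count would be:

```latex
\[
\frac{n(n-1)}{2} = \frac{100 \times 99}{2} = 4950
\]
```

By comparison, LOGOR=NEST1 estimates only two log odds ratios (within patient and between patients of the same physician).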
Thank you for your time and assistance; you helped me understand!