Hi
I am working on a regression model with repeated measures, and my response variable is continuous.
Each physician-patient-drug combination is unique, but a physician may have several patients who are prescribed different medications.
My dataset looks like this (it contains approximately 15,000 rows; approximately 15,000 patients are treated by around 150 physicians):
data drugs;
input physician patient drug $ outcome age gender$;
datalines;
1 1 a1 83.1 55 F
1 1 b5 76.2 55 F
1 1 b7 99.9 55 F
1 4 f2 2.7 82 M
2 4 a1 40.2 82 M
2 5 a2 88.1 65 M
3 6 c8 26.1 49 F
4 2 a1 23.3 89 F
5 3 b1 92.1 67 M
5 3 b2 92.4 67 M
6 7 e4 12.1 28 M
6 8 a2 5.1 73 M
6 8 b2 3.1 73 M
7 9 a2 98.9 79 M
;
proc mixed data=drugs noclprint;
class patient physician drug gender(ref="F");
model outcome = gender age / solution outp=preddata;
repeated drug / subject=physician(patient) type=cs;
run;
My goal is to evaluate whether certain drugs are prescribed correctly, and that will be my outcome measure.
Do you believe the model is accurate? This is my first time working with this type of data.
Thank you for your response.
Teresa
Hi @Teresa12
The repeated statement models the correlation structure of the residuals. If you believe there is random variability at the physician or patient level beyond what is captured by the residuals, consider adding a random statement, for example:
random intercept / subject=physician(patient);
You're using compound symmetry (type=cs), which assumes equal correlation between all drug responses within a patient. This may be too restrictive. Consider testing other structures like type=un (unstructured) or type=ar(1) (autoregressive) and compare models using AIC/BIC.
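As a rough sketch of that comparison, the code below fits the same mean model under two covariance structures and stacks the fit statistics side by side. The macro name fit_cov and the data set names fit_cs, fit_un and fit_compare are hypothetical, chosen only for illustration. Keep in mind that type=un estimates t(t+1)/2 covariance parameters for t drug levels, so it may not be estimable when there are many drugs.

```sas
/* Sketch: fit the same fixed effects under two covariance structures.
   fit_cov, fit_cs, fit_un and fit_compare are hypothetical names. */
%macro fit_cov(type=, out=);
  /* Capture the table holding -2 Res Log Likelihood, AIC, AICC and BIC. */
  ods output FitStatistics=&out;
  proc mixed data=drugs noclprint;
    class patient physician drug gender(ref="F");
    model outcome = gender age / solution;
    repeated drug / subject=physician(patient) type=&type;
  run;
%mend fit_cov;

%fit_cov(type=cs, out=fit_cs);
%fit_cov(type=un, out=fit_un);

/* FitStatistics has one row per criterion (columns Descr and Value). */
data fit_compare;
  length structure $ 8;
  set fit_cs(in=in_cs) fit_un(in=in_un);
  if in_cs then structure = "cs";
  else structure = "un";
run;

proc print data=fit_compare noobs;
run;
```

Smaller AIC/BIC values indicate the better-fitting structure. Because both fits use the default REML method with identical fixed effects, their information criteria are comparable.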
If you're interested in the effect of specific drugs, you might want to include drug as a fixed effect in the model:
model outcome = gender age drug / solution;
You mentioned evaluating whether drugs are “prescribed correctly.” If this is a clinical appropriateness measure, you may need to define a benchmark or threshold for “correctness” and consider logistic regression or classification methods if the outcome becomes binary.
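If the outcome does become binary, one possible starting point is a logistic mixed model in PROC GLIMMIX. This is only a sketch under assumptions: the data set drugs_binary, the variable correct and the macro variable cutoff are hypothetical, and the cut-off value of 50 is a placeholder rather than a clinical recommendation.

```sas
/* Hypothetical threshold for "correct"; replace with a clinically justified value. */
%let cutoff = 50;

/* Derive a 0/1 indicator from the continuous outcome (assumed definition). */
data drugs_binary;
  set drugs;
  correct = (outcome >= &cutoff);
run;

/* Logistic regression with a random intercept for each physician. */
proc glimmix data=drugs_binary;
  class physician gender(ref="F");
  model correct(event="1") = gender age / dist=binary link=logit solution;
  random intercept / subject=physician;
run;
```

The random intercept here sits at the physician level only; whether a patient-level term is also needed depends on how many rows each patient contributes.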
With that, please consider the following test example:
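/* Drug enters as a fixed effect; predictions are saved to preddata. */
proc mixed data=drugs noclprint;
class patient physician drug gender(ref="F");
model outcome = gender age drug / solution outp=preddata;
/* Random intercept for each patient within physician. */
random intercept / subject=physician(patient);
/* Unstructured residual covariance across drugs within the same subject.
   If PROC MIXED warns that the G matrix is not positive definite, the random
   intercept is probably redundant with this structure and can be dropped. */
repeated drug / subject=physician(patient) type=un;
run;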
The following SAS documentation may also help you:
SAS Help Center: Example 78.2 Repeated Measures
and
188-29: Repeated Measures Modeling with PROC MIXED
Hope this helps you! Thanks.
Thank you very much for your assistance.
Your response has been incredibly helpful to me!!!
Thanks a lot for your feedback and for using SAS!