Hi,
I have a data set that contains Patient IDs, Drugs, and date of the prescription fill. I am trying to identify the number of individual patients that were on a drug in a given year. A patients typically have multiple records within a year, as each record representation a prescription fill date. The data structure looks like this:
| Patid | DrugName | Year |
| 100 | Drug 1 | 2013 |
| 200 | Drug 1 | 2013 |
| 200 | Drug 1 | 2013 |
| 200 | Drug 1 | 2013 |
| 200 | Drug 1 | 2013 |
| 300 | Drug 1 | 2013 |
| 100 | Drug 1 | 2014 |
| 300 | Drug 2 | 2013 |
| 100 | Drug 2 | 2015 |
| 400 | Drug 2 | 2015 |
| 400 | Drug 2 | 2015 |
| 400 | Drug 2 | 2015 |
| 300 | Drug 3 | 2014 |
| 300 | Drug 3 | 2015 |
| 100 | Drug 3 | 2015 |
In my head, the output should should look something like this:
| Drug | year | #Patients |
| Drug 1 | 2013 | 3 |
| Drug 1 | 2014 | 1 |
| Drug 2 | 2013 | 1 |
| Drug 2 | 2015 | 1 |
| Drug 3 | 2014 | 1 |
| Drug 3 | 2015 | 2 |
I have tried proc summary, but that just gives me a frequency of the number of records, by drug & year. I am thinking proc sql using count/distinct would work, but not sure how to group by drugname & year, as I am not very good at this...
Any assistance is greatly appreciated.
Thanks!
Chris
You can do that but I'd consider a different method because you are likely to want multiple stats.
data have2;
set have;
by id drug;
First_Drug=0;
if first.drug then First_Drug=1;
run;
proc means data=have2 sum;
class year drug;
var first_drug;
output out=want sum(first_drug) = num_patients;
run;
Thanks. This worked well, and matches up with another step in my analysis.
-Chris
PROC FREQ is pretty good at counting:
proc freq data=have;
tables DrugName*PatID*Year / noprint out=summarized;
run;
proc freq data=summarized;
tables DrugName*Year / list;
run;
The frequency column in the final PROC FREQ will be the patient count. Also note, the second PROC FREQ could create an output data set if needed.
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.