06-16-2016 11:11 AM
I have a data set that contains Patient IDs, Drugs, and date of the prescription fill. I am trying to identify the number of individual patients that were on a drug in a given year. A patients typically have multiple records within a year, as each record representation a prescription fill date. The data structure looks like this:
In my head, the output should should look something like this:
I have tried proc summary, but that just gives me a frequency of the number of records, by drug & year. I am thinking proc sql using count/distinct would work, but not sure how to group by drugname & year, as I am not very good at this...
Any assistance is greatly appreciated.
06-16-2016 11:51 AM
You can do that but I'd consider a different method because you are likely to want multiple stats.
data have2; set have; by id drug; First_Drug=0; if first.drug then First_Drug=1; run;
proc means data=have2 sum;
class year drug;
output out=want sum(first_drug) = num_patients;
06-16-2016 12:10 PM
PROC FREQ is pretty good at counting:
proc freq data=have;
tables DrugName*PatID*Year / noprint out=summarized;
proc freq data=summarized;
tables DrugName*Year / list;
The frequency column in the final PROC FREQ will be the patient count. Also note, the second PROC FREQ could create an output data set if needed.