Hello community,
I have a big dataset with > 1 million observations. It contains information about number of prescriptions filled. I need to flag/select participants with at least two prescriptions of the same drug (either A,B or C) within 6 months 2 years prior to entrance in the study. Participants without any prescription appears with missing values.
Data have:
Patient ID ATC prescription_date entrance
1 drugA 07JAN2019 09JAN2019. <---- patient 1 fulfils criteria for flag for drug A
1 drugA 10FEB2018 09JAN2019
1 drugA 28DEC2018 09JAN2019
1 drugX 06DEC2018 06DEC2018
2 drugB 01OCT2018 07JAN2019
2 drugB 07JAN2019 07JAN2019
3 drugC 09NOV2018 08NOV2018 <----- patient 3 does not fulfil criteria for because he gets prescriptions after entrance.
3 drugC 12DEC2018 08NOV2018
3 drugC 12JAN2019 08NOV2018
Data want:
Patient ID Flag
1 1
2 1
3 0
How about this.
Data have;
input PatientID ATC $ prescription_date:date9. entrance:date9.;
format prescription_date entrance date9.;
datalines;
1 drugA 07JAN2019 09JAN2019
1 drugA 10FEB2018 09JAN2019
1 drugA 28DEC2018 09JAN2019
1 drugX 06DEC2018 06DEC2018
2 drugB 01OCT2018 07JAN2019
2 drugB 07JAN2019 07JAN2019
3 drugC 09NOV2018 08NOV2018
3 drugC 12DEC2018 08NOV2018
3 drugC 12JAN2019 08NOV2018
;
run;
proc sort data=have;
by PatientID ATC;
run;
data want;
set have;
by PatientID ATC;
retain cnt Flag;
if first.PatientID then Flag=0;
if first.ATC then cnt=0;
if prescription_date<=entrance and
intck('year',prescription_date,entrance)<=2 then cnt+1;
if last.ATC and cnt>=2 and Flag=0 then Flag=1;
if last.PatientID then output;
keep PatientID Flag;
run;
How about this.
Data have;
input PatientID ATC $ prescription_date:date9. entrance:date9.;
format prescription_date entrance date9.;
datalines;
1 drugA 07JAN2019 09JAN2019
1 drugA 10FEB2018 09JAN2019
1 drugA 28DEC2018 09JAN2019
1 drugX 06DEC2018 06DEC2018
2 drugB 01OCT2018 07JAN2019
2 drugB 07JAN2019 07JAN2019
3 drugC 09NOV2018 08NOV2018
3 drugC 12DEC2018 08NOV2018
3 drugC 12JAN2019 08NOV2018
;
run;
proc sort data=have;
by PatientID ATC;
run;
data want;
set have;
by PatientID ATC;
retain cnt Flag;
if first.PatientID then Flag=0;
if first.ATC then cnt=0;
if prescription_date<=entrance and
intck('year',prescription_date,entrance)<=2 then cnt+1;
if last.ATC and cnt>=2 and Flag=0 then Flag=1;
if last.PatientID then output;
keep PatientID Flag;
run;
Thank you japelin!
How would you write the code, if you want to combine some of the ATCs meaning that several different ATCs can count as 1?
fx C02, C03, C07, C08, C09 can all count for the same prescription?
If you are talking about multiple ATC codes per PatientID, I think the code would be something like this.
data want;
set have;
by PatientID;
retain cnt Flag;
if first.PatientID then do;
cnt=0;
Flag=0;
end;
if prescription_date<=entrance and
intck('year',prescription_date,entrance)<=2 then cnt+1;
if cnt>=2 and Flag=0 then Flag=1;
if last.PatientID then output;
keep PatientID Flag;
run;
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.