Hello community,
I have a big dataset with > 1 million observations. It contains information about number of prescriptions filled. I need to flag/select participants with at least two prescriptions of the same drug (either A,B or C) within 6 months 2 years prior to entrance in the study. Participants without any prescription appears with missing values.
Data have:
Patient ID ATC prescription_date entrance
1 drugA 07JAN2019 09JAN2019. <---- patient 1 fulfils criteria for flag for drug A
1 drugA 10FEB2018 09JAN2019
1 drugA 28DEC2018 09JAN2019
1 drugX 06DEC2018 06DEC2018
2 drugB 01OCT2018 07JAN2019
2 drugB 07JAN2019 07JAN2019
3 drugC 09NOV2018 08NOV2018 <----- patient 3 does not fulfil criteria for because he gets prescriptions after entrance.
3 drugC 12DEC2018 08NOV2018
3 drugC 12JAN2019 08NOV2018
Data want:
Patient ID Flag
1 1
2 1
3 0
How about this.
Data have;
input PatientID ATC $ prescription_date:date9. entrance:date9.;
format prescription_date entrance date9.;
datalines;
1 drugA 07JAN2019 09JAN2019
1 drugA 10FEB2018 09JAN2019
1 drugA 28DEC2018 09JAN2019
1 drugX 06DEC2018 06DEC2018
2 drugB 01OCT2018 07JAN2019
2 drugB 07JAN2019 07JAN2019
3 drugC 09NOV2018 08NOV2018
3 drugC 12DEC2018 08NOV2018
3 drugC 12JAN2019 08NOV2018
;
run;
proc sort data=have;
by PatientID ATC;
run;
data want;
set have;
by PatientID ATC;
retain cnt Flag;
if first.PatientID then Flag=0;
if first.ATC then cnt=0;
if prescription_date<=entrance and
intck('year',prescription_date,entrance)<=2 then cnt+1;
if last.ATC and cnt>=2 and Flag=0 then Flag=1;
if last.PatientID then output;
keep PatientID Flag;
run;
How about this.
Data have;
input PatientID ATC $ prescription_date:date9. entrance:date9.;
format prescription_date entrance date9.;
datalines;
1 drugA 07JAN2019 09JAN2019
1 drugA 10FEB2018 09JAN2019
1 drugA 28DEC2018 09JAN2019
1 drugX 06DEC2018 06DEC2018
2 drugB 01OCT2018 07JAN2019
2 drugB 07JAN2019 07JAN2019
3 drugC 09NOV2018 08NOV2018
3 drugC 12DEC2018 08NOV2018
3 drugC 12JAN2019 08NOV2018
;
run;
proc sort data=have;
by PatientID ATC;
run;
data want;
set have;
by PatientID ATC;
retain cnt Flag;
if first.PatientID then Flag=0;
if first.ATC then cnt=0;
if prescription_date<=entrance and
intck('year',prescription_date,entrance)<=2 then cnt+1;
if last.ATC and cnt>=2 and Flag=0 then Flag=1;
if last.PatientID then output;
keep PatientID Flag;
run;
Thank you japelin!
How would you write the code, if you want to combine some of the ATCs meaning that several different ATCs can count as 1?
fx C02, C03, C07, C08, C09 can all count for the same prescription?
If you are talking about multiple ATC codes per PatientID, I think the code would be something like this.
data want;
set have;
by PatientID;
retain cnt Flag;
if first.PatientID then do;
cnt=0;
Flag=0;
end;
if prescription_date<=entrance and
intck('year',prescription_date,entrance)<=2 then cnt+1;
if cnt>=2 and Flag=0 then Flag=1;
if last.PatientID then output;
keep PatientID Flag;
run;
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.