Hello,
I'm having trouble figuring out how to delete complete rows from a dataset when the first prescription date for the patient for more than one drug is the same. Here is an example dataset:
data WORK.CLASS(label='Prescription Data'); infile datalines dsd truncover; input Patient_ID $2 Prescription_Date:MMDDYY8. Drug:$1 ; Datalines; XX 01/01/21 A XX 01/01/21 B XX 03/15/21 A XX 04/15/21 C YY 01/01/21 A YY 02/10/21 B YY 03/15/21 A YY 04/15/21 C ;;;;
The table looks something like this:
Patient_ID | Prescription Date | Drug |
XX |
01/01/21 | A |
XX | 01/01/21 | B |
XX | 03/15/21 | B |
XX | 04/15/21 | B |
YY | 01/01/21 | B |
YY | 02/10/21 | B |
YY | 04/15/21 | A |
YY | 04/15/21 | B |
Now for patient XX - they've been prescribed drugs A and B on the exact same date the first time they've ever been prescribed a drug and hence I want all 4 rows for patient XX deleted.
However, for patient YY, there is only one 'first prescription date' which is 01/01/21. They do have a subsequent prescription for drug A and B on 04/15/21. I don't want any rows for patient YY deleted because on their 'first prescription date', they only had one drug prescribed.
Can someone help?
Probably a much easier way but it's late, my suggestion would be a double DoW loop but they can be complicated to code or understand. You may also want to just use multiple steps instead if that's easier.
1. Find the number unique of drugs per day
2. Identify the first day and if it has more than 2 drugs write that ID to a table
3. Filter based on ID's in table2
proc sql;
create table drugsPerDay as
select patient_id, prescription_date, count(distinct drug) as num_drugs
from have
group by patient_id, prescription_date;
quit;
data singleDrugs;
set drugsPerDay;
by patient_ID prescription_date;
if first.patient_id and num_drugs = 1;
run;
data want_data;
merge have (in=t1) singleDrugs (in=t2);
by patient_id;
if t2;
run;
Or squishing it all together in SQL (SQL is the easiest method to get distinct counts)
proc sql;
create table want as
select * from have where patient_id NOT in (
select patient_id from(
select patient_id, prescription_date, count(distinct drug) as num_drugs
from have
group by patient_id, prescription_date)
where num_drugs > 1
group by patient_id
having Prescription_Date=min(Prescription_Date));
quit;
@alaxman wrote:
Hello,
I'm having trouble figuring out how to delete complete rows from a dataset when the first prescription date for the patient for more than one drug is the same. Here is an example dataset:
data WORK.CLASS(label='Prescription Data'); infile datalines dsd truncover; input Patient_ID $2 Prescription_Date:MMDDYY8. Drug:$1 ; Datalines; XX 01/01/21 A XX 01/01/21 B XX 03/15/21 A XX 04/15/21 C YY 01/01/21 A YY 02/10/21 B YY 03/15/21 A YY 04/15/21 C ;;;;The table looks something like this:
Patient_ID Prescription Date Drug XX
01/01/21 A XX 01/01/21 B XX 03/15/21 B XX 04/15/21 B YY 01/01/21 B YY 02/10/21 B YY 04/15/21 A YY 04/15/21 B
Now for patient XX - they've been prescribed drugs A and B on the exact same date the first time they've ever been prescribed a drug and hence I want all 4 rows for patient XX deleted.
However, for patient YY, there is only one 'first prescription date' which is 01/01/21. They do have a subsequent prescription for drug A and B on 04/15/21. I don't want any rows for patient YY deleted because on their 'first prescription date', they only had one drug prescribed.
Can someone help?
1. The code you provided does not run. Please always vet it by pasting back to SAS and running it.
2. Here is step by step, easy to follow (if not at all efficient) method:
proc sql;
create table MIN as
select PATIENT_ID, min(PRESCRIPTION_DATE) as MIN
from CLASS
group by 1 ;
create table COUNT as
select CLASS.PATIENT_ID, count(distinct DRUG) as COUNT
from CLASS, MIN
where CLASS.PATIENT_ID=MIN.PATIENT_ID and PRESCRIPTION_DATE=MIN
group by 1 ;
delete from CLASS where PATIENT_ID in (select PATIENT_ID from COUNT where COUNT>1) ;
quit;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.