Hi SAS Community,
I am trying to create a new data set using PROC SQL that only includes patients with 3 different prescriptions of medication A. However, I also need to include the other medication data from these selected patients (i.e., those that received 3 different prescriptions of medication A).
For example, this is my “Have” data set and the data set I “Want.”
Have:
| Patient_ID | Med_name | Start_date | 
| 001 | A | 01JUN2018 | 
| 001 | A | 15JUN2018 | 
| 001 | A | 01JUL2018 | 
| 001 | B | 01JUN2018 | 
| 001 | C | 01JUN2018 | 
| 002 | A | 05JAN2018 | 
| 002 | D | 15FEB2018 | 
| 003 | A | 20OCT2018 | 
| 003 | A | 15NOV2018 | 
| 003 | A | 15DEC2018 | 
| 003 | E | 15DEC2018 | 
Want:
| Patient_ID | Med_name | Start_date | 
| 001 | A | 01JUN2018 | 
| 001 | A | 15JUN2018 | 
| 001 | A | 01JUL2018 | 
| 001 | B | 01JUN2018 | 
| 001 | C | 01JUN2018 | 
| 003 | A | 20OCT2018 | 
| 003 | A | 15NOV2018 | 
| 003 | A | 15DEC2018 | 
| 003 | E | 15DEC2018 | 
I’m using the PROC SQL code below; however, it produces a table that leaves out the other medication data for each patient (see below). Can someone please help me modify this code so that all medication data are included for each selected patient, as shown above in table "Want"?
proc sql;
create table want as
select *
from have
where Med_name like 'A%'
group by patient_id
having count(distinct start_date) >= 3;
quit;
Data set that I am getting (do not want):
| Patient_ID | Med_name | Start_date | 
| 001 | A | 01JUN2018 | 
| 001 | A | 15JUN2018 | 
| 001 | A | 01JUL2018 | 
| 003 | A | 20OCT2018 | 
| 003 | A | 15NOV2018 | 
| 003 | A | 15DEC2018 | 
Your requirement would translate to:
proc sql;
create table want as
select * 
from have
where patient_id in (
    select patient_id 
    from have 
    where med_name = "A"
    group by patient_id
    having count(*) >= 3);
quit;
It is not clear whether you want three or more "A" prescriptions or exactly three. If the latter, replace >= with = in the above.
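A note on why the original query dropped the other medications: its WHERE clause removes every non-A row before grouping, so only A rows can reach the output. The subquery approach applies that filter only when picking patient IDs. If "3 different prescriptions" should mean three distinct start dates, as the count(distinct start_date) in the question suggests, a variant along these lines may be closer. This is only a sketch that assumes the same have table and column names; the ORDER BY is an optional addition, not part of the original answer.

```sas
proc sql;
    create table want as
    select *
    from have
    where patient_id in (
        /* patients whose medication A rows cover at least 3 distinct start dates */
        select patient_id
        from have
        where med_name = "A"
        group by patient_id
        having count(distinct start_date) >= 3)
    /* optional: sort rows by patient, then medication, then date */
    order by patient_id, med_name, start_date;
quit;
```

With count(*), two A rows sharing a start date would count as two prescriptions; count(distinct start_date) counts them once. Which one is right depends on how duplicates should be treated in your data.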
data have;
infile cards expandtabs;
input Patient_ID	Med_name $	Start_date :date9.;
format Start_date date9.;
cards;
1	A	1-Jun-18
1	A	15-Jun-18
1	A	1-Jul-18
1	B	1-Jun-18
1	C	1-Jun-18
2	A	5-Jan-18
2	D	15-Feb-18
3	A	20-Oct-18
3	A	15-Nov-18
3	A	15-Dec-18
3	E	15-Dec-18
;
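proc sql;
/* s = 1 on a patient's medication A rows when A has exactly 3 distinct start dates */
create table want(drop=s) as
select *
from
    (select *, (count(distinct Start_date)=3 and Med_name='A') as s
     from have
     group by Patient_ID, Med_name)
group by Patient_ID
/* keep all rows of patients that have at least 3 flagged rows */
having sum(s) >= 3;
quit;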
Thank you both! Both suggestions worked great!
