Hi SAS Community,
I am trying to create a new data set using PROC SQL that only includes patients with 3 different prescriptions of medication A. However, I also need to include the other medication data from these selected patients (i.e., those that received 3 different prescriptions of medication A).
For example, this is my “Have” data set and the data set I “Want.”
Have:
Patient_ID | Med_name | Start_date |
001 | A | 01JUN2018 |
001 | A | 15JUN2018 |
001 | A | 01JUL2018 |
001 | B | 01JUN2018 |
001 | C | 01JUN2018 |
002 | A | 05JAN2018 |
002 | D | 15FEB2018 |
003 | A | 20OCT2018 |
003 | A | 15NOV2018 |
003 | A | 15DEC2018 |
003 | E | 15DEC2018 |
Want:
Patient_ID | Med_name | Start_date |
001 | A | 01JUN2018 |
001 | A | 15JUN2018 |
001 | A | 01JUL2018 |
001 | B | 01JUN2018 |
001 | C | 01JUN2018 |
003 | A | 20OCT2018 |
003 | A | 15NOV2018 |
003 | A | 15DEC2018 |
003 | E | 15DEC2018 |
I’m using the PROC SQL code below; however, it produces a table that leaves out the other medication data for each patient (see below). Can someone please help me modify this code so that all medication data are included for each selected patient, as shown above in table "Want"?
proc sql ;
create table want as select *
from have
where Med_name like 'A%'
group by patient_id
having count(distinct start_date)>=3;
quit;
Data set that I am getting (do not want):
Patient_ID | Med_name | Start_date |
001 | A | 01JUN2018 |
001 | A | 15JUN2018 |
001 | A | 01JUL2018 |
003 | A | 20OCT2018 |
003 | A | 15NOV2018 |
003 | A | 15DEC2018 |
Your requirement would translate to:
proc sql;
create table want as
select *
from have
where patient_id in (
select patient_id
from have
where med_name = "A"
group by patient_id
having count(*) >= 3);
quit;
It is not clear whether you want three or more "A" prescriptions or exactly three. If the latter, replace >= with = in the above.
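The original query counted distinct start dates rather than rows, so here is a hedged variant of the subquery approach that keeps that behavior. It is only a sketch: it assumes that two "A" records for the same patient on the same date should count as one prescription, and the table name want_distinct and the final ORDER BY are illustrative additions, not part of the original answer.

```sas
proc sql;
    /* want_distinct is a hypothetical output name used for illustration */
    create table want_distinct as
    select *
    from have
    where patient_id in (
        /* patients with at least 3 distinct start dates for medication A */
        select patient_id
        from have
        where med_name = "A"
        group by patient_id
        having count(distinct start_date) >= 3)
    order by patient_id, med_name, start_date;
quit;
```

With the sample data in this thread, no patient has two "A" records on the same date, so this should select the same patients (001 and 003) as the count(*) version.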
data have;
infile cards expandtabs;
input Patient_ID Med_name $ Start_date :date9.;
format Start_date date9.;
cards;
1 A 1-Jun-18
1 A 15-Jun-18
1 A 1-Jul-18
1 B 1-Jun-18
1 C 1-Jun-18
2 A 5-Jan-18
2 D 15-Feb-18
3 A 20-Oct-18
3 A 15-Nov-18
3 A 15-Dec-18
3 E 15-Dec-18
;
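proc sql;
/* s is 1 on medication A rows when that patient has exactly 3 distinct A start dates, otherwise 0 */
create table want(drop=s) as
select *
from
(select * ,(count(distinct Start_date)=3 and Med_name='A') as s
from have
group by Patient_ID,Med_name)
/* keep every row of a patient whose flags sum to at least 3 */
group by patient_id
having sum(s)>=3;
quit;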
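Both answers rely on the same idea: compute a per-patient count of "A" records, then keep all rows for patients who pass. A shorter sketch of that idea is possible if one relies on PROC SQL remerging summary statistics back onto the detail rows. This is an assumption-laden alternative rather than either poster's method: the table name want_remerge is hypothetical, and it counts "A" rows (like count(*)), not distinct dates.

```sas
proc sql;
    /* want_remerge is a hypothetical output name used for illustration */
    create table want_remerge as
    select *
    from have
    group by patient_id
    /* (med_name = 'A') evaluates to 1 or 0 per row, so the sum is the   */
    /* number of A records for the patient; HAVING then keeps or drops   */
    /* all of that patient's rows together                               */
    having sum(med_name = 'A') >= 3;
quit;
```

SAS should write a note to the log saying that the query requires remerging summary statistics back with the original data; that is expected here. If exactly three prescriptions are required, replace >= with =.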
Thank you both! Both suggestions worked great!