Hello, I am working with a sample of health record data and I need some help writing a SAS program to answer a particular question from this data set. I need to know the total number of patients who have had at least three prescriptions of drugA or drugB or drugC (it can be one or the other), at least 21 days apart over the past 6 months (let's say six months from today 3/29/19).
Please see below for an example of how the data set is organized. Of note, the date variables are numeric variables in DATETIME25.6 format. I am a relatively new SAS user so any help with this question would be much appreciated!
Patient ID med_name start_date end_date
1 drugA 07JAN2019 09JAN2019
1 drugA 10FEB2019 11FEB2019
1 drugA 20MAR2019 20MAR2019
1 drugX 06DEC2018 06DEC2018
2 drugB 01OCT2018 02OCT2018
2 drugD 07JAN2019 09JAN2019
3 drugC 01NOV2018 08NOV2018
3 drugC 12DEC2018 15DEC2018
3 drugC 12JAN2019 17JAN2019
.....
Hi @novinosrin, I received an error message in the log indicating the input data set (meds2) is empty. Any suggestions? I have provided my code below. You'll see that I am first trying to change the date variables from DATETIME25.6 to date9. format. Forgive me for any naivety here.
data meds2;
set meds;
format start_date date9.;
informat start_date date9.;
format end_date date9.;
informat end_date date9.;
run;
proc sort data=meds2 (where=(med_name in ('drugA', 'drugB', 'drugC') and
intnx('month',today(),-6,'s')<=start_date<=today())) out=meds3;
by patientID start_date end_date;
run;
data meds4;
set meds3;
by PatientID start_date end_date;
k=lag(end_date);
if first.PatientID then c=1;
else if intck('days',k,start_date)>=21 then c+1;
if c>=3 and last.PatientID;
drop k c;
run;
May I seek some help in understanding your required output as there is no sample WANT for the sample HAVE
1. Can the drugs be any or must it be only A,B or C.
2. Can the same drug be prescribed or should it be different i.e The new start can for the same drug can start after the previous and be 21 days apart? will this qualify?
3. Over the past 6 months? does that mean current day- 6 months "sameday" occurrence period alone qualify ?Anything beyond in the past need not considered?
4 A clear expected output for the sample explaining how and why (business requirement) will help in my opinion.
5. Above all, somebody might have understood and perhaps the solution is on its way, but just in case-->
For what it's worth, have a look the numofdays column below and offer us a better intuition. Thank you
PatientID | med_name | start_date | end_date | numofdays |
---|---|---|---|---|
1 | drugA | 07JAN2019 | 09JAN2019 | . |
1 | drugA | 10FEB2019 | 11FEB2019 | 32 |
1 | drugA | 20MAR2019 | 20MAR2019 | 37 |
1 | drugX | 06DEC2018 | 06DEC2018 | -104 |
2 | drugB | 01OCT2018 | 02OCT2018 | . |
2 | drugD | 07JAN2019 | 09JAN2019 | 97 |
3 | drugC | 01NOV2018 | 08NOV2018 | . |
3 | drugC | 12DEC2018 | 15DEC2018 | 34 |
3 | drugC | 12JAN2019 | 17JAN2019 | 28 |
Hi novinosrin,
Thank you for your reply. Please see below for my answers to your questions.
1. Can the drugs be any or must it be only A,B or C.
The drugs must be only A, B, or C.
2. Can the same drug be prescribed or should it be different i.e The new start can for the same drug can start after the previous and be 21 days apart? will this qualify?
Yes. The same drug can be prescribed as long as the start date is 21 days after the previous end date.
3. Over the past 6 months? does that mean current day- 6 months "sameday" occurrence period alone qualify ?Anything beyond in the past need not considered?
Yes, anything beyond the past 6 months should not be considered. The period should be between the current day (3/29/19) - 6 months from the current date (9/29/18). Prescriptions that began/ended on the start (3/29/19) and end (9/29/18) days of this period can be included.
4 A clear expected output for the sample explaining how and why (business requirement) will help in my opinion.
Regarding an expected output, I would simply just like the total number of patients that meet this criteria. Therefore, a binary variable could be created for each patient (yes/no) indicating whether they meet the criteria and I could run proc freq to get total count. Other suggestions are welcome too.
Hi @wj2 See if this works and let me know
data have;
infile cards truncover;
input PatientID med_name $ (start_date end_date) (:date9.);
format start_date end_date date9.;
cards;
1 drugA 07JAN2019 09JAN2019
1 drugA 10FEB2019 11FEB2019
1 drugA 20MAR2019 20MAR2019
1 drugX 06DEC2018 06DEC2018
2 drugB 01OCT2018 02OCT2018
2 drugD 07JAN2019 09JAN2019
3 drugC 01NOV2018 08NOV2018
3 drugC 12DEC2018 15DEC2018
3 drugC 12JAN2019 17JAN2019
;
proc sort data=have(where=(med_name in ('drugA','drugB','drugC') and
intnx('month',today(),-6,'s')<=start_date<=today())) out=_have;
by PatientID start_date end_date;
run;
data want;
set _have;
by PatientID start_date end_date;
k=lag(end_date);
if first.PatientID then c=1;
else if intck('days',k,start_date)>=21 then c+1;
if c>=3 and last.PatientID;
drop k c;
run;
/*Now run a proc freq and all kinds of count methods you want as you wish*/
Hi @novinosrin, I received an error message in the log indicating the input data set (meds2) is empty. Any suggestions? I have provided my code below. You'll see that I am first trying to change the date variables from DATETIME25.6 to date9. format. Forgive me for any naivety here.
data meds2;
set meds;
format start_date date9.;
informat start_date date9.;
format end_date date9.;
informat end_date date9.;
run;
proc sort data=meds2 (where=(med_name in ('drugA', 'drugB', 'drugC') and
intnx('month',today(),-6,'s')<=start_date<=today())) out=meds3;
by patientID start_date end_date;
run;
data meds4;
set meds3;
by PatientID start_date end_date;
k=lag(end_date);
if first.PatientID then c=1;
else if intck('days',k,start_date)>=21 then c+1;
if c>=3 and last.PatientID;
drop k c;
run;
Hi @wj2 My understanding is your datetime(number of seconds from Jan1,1960) rather than date(number of days since jan1,1960) of course formatted as Datetime appropriately for display.
Okay no worries
Extract the date from datetime value using Datepart function like this
data meds2;
set meds;
_start=datepart(start);
_end=datepart(end);
format start_date date9.;
informat start_date date9.;
format end_date date9.;
informat end_date date9.;
run;
Use the _ prefixed extracted date variables in the logic I gave you.
--------------------------------------------------------------------------------------------------------
Alternatively, you could do revise to datepart(start_date)
intnx('month',today(),-6,'s')<=datepart(start_date)<=today())) out=meds3;
everywhere in the same code wherever the date variables are referenced but to make the process smooth, the former is a more neat process. HTH
PS
Format is just for display and do not confuse as though that can be used to extract date value from a datetime value
@novinosrin Thank you! This ended up fixing the issue.
Also, I may have misspoke, but the start date for each prescription should be at least 21 days after the previous start date (not end date). Is this how you understand it and provided the code for or should the code be modified to reflect this criteria?
@wj2 My understanding for each Patientid is
1. Current row's start - previous row's end >= 21
Please let me know your requirement
It should actually be: current row's start_date - previous row's start_date = >=21
How do you suggest modifying the code to meet this criteria?
Ok @wj2 At 2:27 Central Time Chicago, I am starving. So I am stepping out for lunch. Will be back in like 30 mins or so to modify the code. Bear with me
Hi @wj2
Ok, The only change is
k=lag(start_date);
instead of
k=lag(end_date);
Please test and let me know
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.