I have prescriptions data where- each patient (ID) may have different prescriptions of the same drug or different drugs.
I want to stitch the prescriptions for the SAME drugs for EACH patient together if the START_DATE of the second prescription falls between the START_DATE and END_DATE of the first prescription or if the START_DATE of the second prescription is within seven days after the END_DATE of the first prescription. Then I want to join the two prescriptions together having the START_DATE of the first prescription and END_DATE of the second prescription. I also want to add the DAYS_SUPP of the two prescriptions(if the first and second prescriptions’ DAYS_SUPP are 5 and 7 days then after stitching the two prescriptions the new DAYS_SUPP will be 12). I also want to keep the second prescription ORDER_DESC as ORDER_DESC2 as shown in the data I want.
If the third prescription’s (of the same DRUG and same patient obviously) START_DATE falls between the START_DATE and END_DATE of the second prescription or if the START_DATE is within seven days after the END_DATE of the second prescription then I also want to join the three prescriptions into one as shown in the data I want. I also want to add the DAYS_SUPP of the three prescriptions together and keep the ORDER_DESC of the third prescription as ORDER_DESC3.
Data I have(sorted by ID, DRUG, START_DATE)
ID |
DRUG |
START_DATE |
END_DATE |
DAYS_SUPP |
ORDER_DESC |
1 |
A |
8/13/2015 |
8/20/2015 |
8 |
ABC |
1 |
A |
8/17/2015 |
8/30/2015 |
14 |
DEF |
1 |
A |
9/5/2015 |
9/15/2015 |
11 |
GHI |
1 |
A |
11/10/2015 |
11/18/2015 |
9 |
RTG |
1 |
B |
5/23/2016 |
5/30/2016 |
8 |
WER |
1 |
B |
6/15/2016 |
6/18/2016 |
4 |
ASD |
1 |
C |
6/15/2016 |
6/22/2016 |
8 |
YYT |
1 |
C |
7/10/2016 |
7/16/2016 |
7 |
ASS |
1 |
C |
7/16/2016 |
7/20/2016 |
5 |
FGH |
1 |
C |
8/22/2016 |
8/25/2016 |
4 |
RTY |
1 |
D |
5/23/2017 |
5/30/2017 |
8 |
RET |
1 |
D |
6/15/2017 |
6/18/2017 |
4 |
RFT |
2 |
A |
8/17/2015 |
8/19/2015 |
3 |
ERT |
2 |
D |
7/16/2016 |
7/19/2016 |
5 |
TYU |
Data I want
ID |
DRUG |
START_DATE |
END_DATE |
DAYS_SUPP |
ORDER_DESC |
ORDER_DESC2 |
ORDER_DESC3 |
1 |
A |
8/13/2015 |
9/15/2015 |
33 |
ABC |
DEF |
GHI |
1 |
A |
11/10/2015 |
11/18/2015 |
9 |
RTG |
||
1 |
B |
5/23/2016 |
5/30/2016 |
8 |
WER |
||
1 |
B |
6/15/2016 |
6/18/2016 |
4 |
ASD |
||
1 |
C |
6/15/2016 |
6/22/2016 |
8 |
YYT |
||
1 |
C |
7/10/2016 |
7/20/2016 |
12 |
ASS |
FGH |
|
1 |
C |
8/22/2016 |
8/25/2016 |
4 |
RTY |
||
1 |
D |
5/23/2017 |
5/30/2017 |
8 |
RET |
||
1 |
D |
6/15/2017 |
6/18/2017 |
4 |
RFT |
||
2 |
A |
8/17/2015 |
8/19/2015 |
3 |
ERT |
||
2 |
D |
7/16/2016 |
7/19/2016 |
5 |
TYU |
Assuming I understood what you mean.
data have;
input ID
DRUG $
START_DATE : mmddyy10.
END_DATE : mmddyy10.
DAYS_SUPP
ORDER_DESC $;
format START_DATE
END_DATE mmddyy10. ;
cards;
1
A
8/13/2015
8/20/2015
8
ABC
1
A
8/17/2015
8/30/2015
14
DEF
1
A
9/5/2015
9/15/2015
11
GHI
1
A
11/10/2015
11/18/2015
9
RTG
1
B
5/23/2016
5/30/2016
8
WER
1
B
6/15/2016
6/18/2016
4
ASD
1
C
6/15/2016
6/22/2016
8
YYT
1
C
7/10/2016
7/16/2016
7
ASS
1
C
7/16/2016
7/20/2016
5
FGH
1
C
8/22/2016
8/25/2016
4
RTY
1
D
5/23/2017
5/30/2017
8
RET
1
D
6/15/2017
6/18/2017
4
RFT
2
A
8/17/2015
8/19/2015
3
ERT
2
D
7/16/2016
7/19/2016
5
TYU
;
data temp;
set have;
by id drug;
if first.drug or start_date-lag(end_date)>7 then group+1;
run;
proc summary data=temp;
by id drug group;
var start_date end_date days_supp;
output out=part1(drop=_:) min(start_date)= max(end_date)= sum(days_supp)=;
run;
proc transpose data=temp out=part2(drop=_:) prefix=order_desc;
by id drug group;
var order_desc;
run;
data want(drop=group);
merge part1 part2;
by id drug group;
run;
Hi Ksharp, Thank you for your reply. I am going to run this code.
-
SR
@SR11 wrote:
Hi,
Thanks again. I tried your code. But after transposing, it created a 1.4TB file for my large data. it made it very difficult to handle in my 3TB virtual machine.
is there any other method to do my analysis?
Thanks.
SR
How big is your initial dataset?
And please post the complete log from the code you ran.
I want to stitch the prescriptions for the SAME drugs for EACH patient together
I am sure you have good reasons for this, but except for very few things, your wanted structure is crap. All starts with the problem, that you can't create variables dynamically. You have to know how many obs will be merged before you start to merge them, which is hardly possible.
Here's an ugly attempt to solve the problem, in future posts i except, that you post data in usable form.
proc sql noprint;
select count_id
into :maxCount trimmed
from (
select count(id) as count_id
from work.have
group by id, drug
)
having count_id = max(count_id)
;
quit;
data want;
set have(rename=(order_desc = _order_desc));
by id drug;
length
order_desc1 - order_desc&maxCount. $ 3
first_start last_start last_end 8
sum_days 8
o_count 8
;
retain order_desc: sum_days o_count first_start;
array descs order_desc1 - order_desc&maxCount.;
format first_start last_start last_end date9.;
last_start = lag(start_date);
last_end = lag(end_date);
if first.drug then do;
call missing(of order_desc:);
sum_days = days_supp;
order_desc1 = _order_desc;
o_count = 2;
first_start = start_date;
end;
else do;
if (last_start <= start_date <= last_date) or (last_end + 7 > start_date) then do;
descs[o_count] = _order_desc;
sum_days = sum_days + days_supp;
o_count = o_count + 1;
end;
else do;
s = start_date;
e = end_date;
d = days_supp;
start_date = first_start;
end_date = last_end;
days_supp = sum_days;
output;
start_date = s;
end_date = e;
days_supp = d;
call missing(of order_desc:);
first_start = s;
sum_days = days_supp;
order_desc1 = _order_desc;
end;
end;
if last.drug then do;
start_date = first_start;
days_supp = sum_days;
output;
end;
drop s e d first_start last_: sum_days _order_desc o_count;
run;
Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.
Register today!Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.