Re: conditional overlap of two periods

lillymaginta · Posted 06-22-2017 07:50 AM

I have the following dataset, I need to count each id only once based on the highest order achieved into:

mono

dual

2 or more

(for example: if the same patient have drugs with no overlap, another two drugs overlapped= then count this patient one time under 2 drug overlap or dual) based on the presence of one of the two conditions:

overlap of 60 days or more between drugs or if the drugs overlap at two different time periods by 30 days then count them (for example one period they same two drugs overlap by 30 and in another period by 40 days count this id as dual)

id drug  start end
1  a    1/1/2004  4/4/2004
1 b     2/2/2004   6/6/2004
1 d    1/4/2005  4/4/2005
2  a    3/1/2006    4/2/2006
2  b    2/2/2006     5/3/2006
2  c    2/2/2006      4/4/2006
2  d   2/3/2001       4/4/2001
3  a    3/3/2001      4/3/2001
3  b    3/2/2002      4/2/2002
4  a    6/1/2001      8/2/2001
4  b    6/1/2001       7/7/2001
4  a    2/2/2001       4/4/2001
4  b    2/5/2001       3/28/2001

the output would be

mono or one drug : 1 (patient 3 counted here)

dual or two drugs overlap : 2 (patient 2 and 4 would be counted here)

three or more (patient 1)

I don't need the actual drugs that overlap just a count of the frequency where each patient can be counted once only.

There are a totol of 6 drugs.

RW9 · Posted 06-22-2017 07:57 AM

I don't have time right now to code anything, but a couple of pointers. You can retain an overlap counter by id, increment it based on your logic and start_date between lag(start_date) and lag(end_date). Then output only last.id. This would give you a list of ids and a count of overlaps which you could then count.

Kurt_Bremser · Posted 06-22-2017 08:25 AM

data have;
input id drug $ start :mmddyy10. end :mmddyy10.;
format start end mmddyy10.;
cards;
1  a    1/1/2004  4/4/2004
1 b     2/2/2004   6/6/2004
1 d    1/4/2005  4/4/2005
2  a    3/1/2006    4/2/2006
2  b    2/2/2006     5/3/2006
2  c    2/2/2006      4/4/2006
2  d   2/3/2001       4/4/2001
3  a    3/3/2001      4/3/2001
3  b    3/2/2002      4/2/2002
4  a    6/1/2001      8/2/2001
4  b    6/1/2001       7/7/2001
4  a    2/2/2001       4/4/2001
4  b    2/5/2001       3/28/2001
;
run;

data want1;
set have;
by id;
drugno = 1;
retain count;
if first.id then count = 1; else count + 1;
if lag(end) >= start and count > 1 then drugno = drugno + 1;
if lag2(end) >= start and count > 2 then drugno = drugno + 1;
drugno = min(drugno,3);
run;

proc sql;
create table want2 as select id, max(drugno) as drugno
from want1 group by id;
quit;

proc freq data=want2;
tables drugno;
run;

This detects simple overlaps. If you want the length of the overlaps factored in, you will have to create several drugcounts for different lengths and drugs, so you can then make your further calculations. This is just to show the use of the lag() functions.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

KurtBremser · Posted 06-22-2017 10:19 AM

Thank you KurtBremser, that is very helpful but can you provide a brief example on how to use it to account for the number of days? Thank you!

conditional overlap of two periods