Solved: How do I count total days of overlap?

ssulli11 · Posted 09-16-2018 11:10 PM

Hello,

I am new to this board so apologies if I don't yet know the proper etiquette for posting. I have found a couple questions posed that are similar to mine in nature, but none is quite what I am looking for. I don't want to waste anyone's time so if I have missed another solution please point me in the right direction and I will be happy to try those out!

I am looking to find the total days overlapped between two drug classes throughout 1 year. There may be multiple periods of overlap throughout the year, however, I want to sum the total days of overlap into one final number. I would also like to include the first day of overlap.

I have already broken my data down into years so each file is one complete year. Below is a simplified version of my data with what I currently 'have' and the output I 'want'.

HAVE	HAVE	HAVE	HAVE	HAVE	WANT	WANT
ID	Year	RxStart	RxEnd	DrugClass	Overlap (days)	1st Day of Overlap
1	2009	17898	17928	0	30	17898
1	2009	17898	17928	1	30	17898
1	2009	18141	18171	0	30	17898
1	2009	18171	18201	0	30	17898
2	2009	18079	18115	1	15	18110
2	2009	18110	18140	0	15	18110
2	2009	18215	18230	1	15	18110
2	2009	18220	18234	0	15	18110
3	2009	18089	18125	0	0	.
3	2009	18126	18146	0	0	.
3	2009	18156	18186	1	0	.
3	2009	18186	18196	1	0	.

Any and all help is appreciated! Thank you for your time.

PGStats · Posted 09-17-2018 12:33 AM

Using 1-year arrays:

data have;
input ID Year RxStart RxEnd DrugClass;
format RxStart RxEnd yymmdd10.;
datalines; 
1 2009 17898 17928 0  
1 2009 17898 17928 1  
1 2009 18141 18171 0  
1 2009 18171 18201 0  
2 2009 18079 18115 1  
2 2009 18110 18140 0  
2 2009 18215 18230 1  
2 2009 18220 18234 0  
3 2009 18089 18125 0 
3 2009 18126 18146 0 
3 2009 18156 18186 1 
3 2009 18186 18196 1 
;

proc sort data=have; by year ID; run;

data want;
array c0{0:366};
array c1{0:366};
do until(last.ID);
    set have; by year id;
    do date = rxStart to rxEnd;
        i = intck("day", mdy(1, 1, year), date);
        if drugClass = 0 
            then c0{i} = 1;
            else c1{i} = 1;
        end;
    end;
overlap = 0;
do i = 0 to 366;
    if c0{i} and c1{i} then do;
        overlap = overlap + 1;
        if missing(firstDay) 
            then firstDay = intnx("day", mdy(1, 1, year), i); 
        end;
    end;
do until(last.ID);
    set have; by year id;
    output;
    end;
format firstDay yymmdd10.;
drop c0: c1: date i;
run;

proc print data=want noobs; run;

PG

View solution in original post

PGStats · Posted 09-17-2018 12:33 AM

Using 1-year arrays:

data have;
input ID Year RxStart RxEnd DrugClass;
format RxStart RxEnd yymmdd10.;
datalines; 
1 2009 17898 17928 0  
1 2009 17898 17928 1  
1 2009 18141 18171 0  
1 2009 18171 18201 0  
2 2009 18079 18115 1  
2 2009 18110 18140 0  
2 2009 18215 18230 1  
2 2009 18220 18234 0  
3 2009 18089 18125 0 
3 2009 18126 18146 0 
3 2009 18156 18186 1 
3 2009 18186 18196 1 
;

proc sort data=have; by year ID; run;

data want;
array c0{0:366};
array c1{0:366};
do until(last.ID);
    set have; by year id;
    do date = rxStart to rxEnd;
        i = intck("day", mdy(1, 1, year), date);
        if drugClass = 0 
            then c0{i} = 1;
            else c1{i} = 1;
        end;
    end;
overlap = 0;
do i = 0 to 366;
    if c0{i} and c1{i} then do;
        overlap = overlap + 1;
        if missing(firstDay) 
            then firstDay = intnx("day", mdy(1, 1, year), i); 
        end;
    end;
do until(last.ID);
    set have; by year id;
    output;
    end;
format firstDay yymmdd10.;
drop c0: c1: date i;
run;

proc print data=want noobs; run;

PG

FreelanceReinh · Posted 09-17-2018 05:12 AM

Hello @ssulli11,

Your overlap counts (30, 15) suggest that RxEnd is the day after the last drug administration (in the respective period). If this is the case, you should adapt the relevant DO loop in PG's solution accordingly:

do date = rxStart to rxEnd-1;

novinosrin · Posted 09-17-2018 12:44 PM

data have;
input ID Year RxStart RxEnd DrugClass;
format RxStart RxEnd yymmdd10.;
datalines; 
1 2009 17898 17928 0  
1 2009 17898 17928 1  
1 2009 18141 18171 0  
1 2009 18171 18201 0  
2 2009 18079 18115 1  
2 2009 18110 18140 0  
2 2009 18215 18230 1  
2 2009 18220 18234 0  
3 2009 18089 18125 0 
3 2009 18126 18146 0 
3 2009 18156 18186 1 
3 2009 18186 18196 1 
;

data want;
do until(last.id);
	set have;
	by id;
	k=lag(rxend);
	if first.id then do; flag=0;Day1stofOverlap=.;Overlap=0;end;
	if id=lag(id) and lag(drugclass) ne drugclass and flag eq 0 and rxstart<k then do; 
	dif=k-rxstart;flag=1;
	if not Day1stofOverlap then Day1stofOverlap=rxstart;
end;
else flag=0;
Overlap+dif;
call missing(dif);
end;
do until(last.id);
	set have;
	by id;
	output;
end;
format  Day1stofOverlap yymmdd10.;
drop k dif;
run;

A request plz:

If the above works or does not work let me know a feedback be it positive or negative. The logic was given to me by a Python expert college mate of mine by the name Mr. Jim Scully at DePaul university, Chicago. I merely translated his logic to SAS syntax. We look forward to your feedback for our continuous learning and improvement. Thank you

How do I count total days of overlap?

Re: How do I count total days of overlap?

Re: How do I count total days of overlap?

Re: How do I count total days of overlap?

Re: How do I count total days of overlap?

How do I count total days of overlap?

Re: How do I count total days of overlap?

Re: How do I count total days of overlap?

Re: How do I count total days of overlap?

Re: How do I count total days of overlap?

Registration is open