BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ssulli11
Calcite | Level 5

Hello,

 

I am new to this board so apologies if I don't yet know the proper etiquette for posting. I have found a couple questions posed that are similar to mine in nature, but none is quite what I am looking for.  I don't want to waste anyone's time so if I have missed another solution please point me in the right direction and I will be happy to try those out!

 

I am looking to find the total days overlapped between two drug classes throughout 1 year.  There may be multiple periods of overlap throughout the year, however, I want to sum the total days of overlap into one final number.  I would also like to include the first day of overlap. 

 

I have already broken my data down into years so each file is one complete year.  Below is a simplified version of my data with what I currently 'have' and the output I 'want'.

 

HAVEHAVEHAVEHAVEHAVEWANTWANT
IDYearRxStartRxEndDrugClassOverlap (days)1st Day of Overlap
12009178981792803017898
12009178981792813017898
12009181411817103017898
12009181711820103017898
22009180791811511518110
22009181101814001518110
22009182151823011518110
22009182201823401518110
32009180891812500.
32009181261814600.
32009181561818610.
32009181861819610.

 

Any and all help is appreciated!  Thank you for your time.

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

Using 1-year arrays:

 

data have;
input ID Year RxStart RxEnd DrugClass;
format RxStart RxEnd yymmdd10.;
datalines; 
1 2009 17898 17928 0  
1 2009 17898 17928 1  
1 2009 18141 18171 0  
1 2009 18171 18201 0  
2 2009 18079 18115 1  
2 2009 18110 18140 0  
2 2009 18215 18230 1  
2 2009 18220 18234 0  
3 2009 18089 18125 0 
3 2009 18126 18146 0 
3 2009 18156 18186 1 
3 2009 18186 18196 1 
;

proc sort data=have; by year ID; run;

data want;
array c0{0:366};
array c1{0:366};
do until(last.ID);
    set have; by year id;
    do date = rxStart to rxEnd;
        i = intck("day", mdy(1, 1, year), date);
        if drugClass = 0 
            then c0{i} = 1;
            else c1{i} = 1;
        end;
    end;
overlap = 0;
do i = 0 to 366;
    if c0{i} and c1{i} then do;
        overlap = overlap + 1;
        if missing(firstDay) 
            then firstDay = intnx("day", mdy(1, 1, year), i); 
        end;
    end;
do until(last.ID);
    set have; by year id;
    output;
    end;
format firstDay yymmdd10.;
drop c0: c1: date i;
run;

proc print data=want noobs; run;

PG

View solution in original post

3 REPLIES 3
PGStats
Opal | Level 21

Using 1-year arrays:

 

data have;
input ID Year RxStart RxEnd DrugClass;
format RxStart RxEnd yymmdd10.;
datalines; 
1 2009 17898 17928 0  
1 2009 17898 17928 1  
1 2009 18141 18171 0  
1 2009 18171 18201 0  
2 2009 18079 18115 1  
2 2009 18110 18140 0  
2 2009 18215 18230 1  
2 2009 18220 18234 0  
3 2009 18089 18125 0 
3 2009 18126 18146 0 
3 2009 18156 18186 1 
3 2009 18186 18196 1 
;

proc sort data=have; by year ID; run;

data want;
array c0{0:366};
array c1{0:366};
do until(last.ID);
    set have; by year id;
    do date = rxStart to rxEnd;
        i = intck("day", mdy(1, 1, year), date);
        if drugClass = 0 
            then c0{i} = 1;
            else c1{i} = 1;
        end;
    end;
overlap = 0;
do i = 0 to 366;
    if c0{i} and c1{i} then do;
        overlap = overlap + 1;
        if missing(firstDay) 
            then firstDay = intnx("day", mdy(1, 1, year), i); 
        end;
    end;
do until(last.ID);
    set have; by year id;
    output;
    end;
format firstDay yymmdd10.;
drop c0: c1: date i;
run;

proc print data=want noobs; run;

PG
FreelanceReinh
Jade | Level 19

Hello @ssulli11,

 

Your overlap counts (30, 15) suggest that RxEnd is the day after the last drug administration (in the respective period). If this is the case, you should adapt the relevant DO loop in PG's solution accordingly:

do date = rxStart to rxEnd-1;
novinosrin
Tourmaline | Level 20
data have;
input ID Year RxStart RxEnd DrugClass;
format RxStart RxEnd yymmdd10.;
datalines; 
1 2009 17898 17928 0  
1 2009 17898 17928 1  
1 2009 18141 18171 0  
1 2009 18171 18201 0  
2 2009 18079 18115 1  
2 2009 18110 18140 0  
2 2009 18215 18230 1  
2 2009 18220 18234 0  
3 2009 18089 18125 0 
3 2009 18126 18146 0 
3 2009 18156 18186 1 
3 2009 18186 18196 1 
;

data want;
do until(last.id);
	set have;
	by id;
	k=lag(rxend);
	if first.id then do; flag=0;Day1stofOverlap=.;Overlap=0;end;
	if id=lag(id) and lag(drugclass) ne drugclass and flag eq 0 and rxstart<k then do; 
	dif=k-rxstart;flag=1;
	if not Day1stofOverlap then Day1stofOverlap=rxstart;
end;
else flag=0;
Overlap+dif;
call missing(dif);
end;
do until(last.id);
	set have;
	by id;
	output;
end;
format  Day1stofOverlap yymmdd10.;
drop k dif;
run;

A request plz:

If the above works or does not work let me know a feedback be it positive or negative. The logic was given to me by a Python expert college mate of mine by the name Mr. Jim Scully  at DePaul university, Chicago. I merely translated his logic to SAS syntax. We look forward to your feedback for our continuous learning and improvement. Thank you

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 3 replies
  • 1752 views
  • 1 like
  • 4 in conversation