Help using Base SAS procedures

dealing with time periods with start and end date and overlap

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 128
Accepted Solution

dealing with time periods with start and end date and overlap

[ Edited ]
drug   start                    end
a        01/01/2001    02/15/2001
b        01/02/2001    02/10/2001
c        04/05/2002     06/07/2002
d        04/15/2002     07/01/2002
e       04/01/2002      07/18/2002
f       04/05/2002      06/01/2002 

I want to report drug that overlapped by 30 days or more. 

Output: 

a-b

c-d-e-f 


Accepted Solutions
Solution
‎12-05-2016 08:37 AM
Super User
Posts: 10,020

Re: dealing with time periods with start and end date and overlap

Posted in reply to lillymaginta
OK. So you also care about END-START ?




data have;
   input id drug  $ start : mmddyy10.    end : mmddyy10.;
   format start end mmddyy10.;
cards;
1 c        04/05/2002     06/07/2002
1 d        04/15/2002     04/25/2002
1 e       04/15/2002      04/18/2002
2 a        01/01/2002    02/15/2002
2 b        01/05/2002    03/19/2002
2 c        01/24/2002     02/17/2002
2 d        04/10/2002     07/01/2002
2 e       04/01/2002      07/28/2002
2 f       04/25/2002      06/30/2002
;
run;

data temp;
   set have;
   by id;
   dif=lag(end)-start;
   if dif lt 30 or (end-start) lt 30 or first.id then group+1;
run;

data want;
   length want $ 200;
   do i=1 by 1 until(last.group);
      set Temp;
      by id group;
      want=catx('-',want,drug);
   end;
   if i ne 1 then output;
   keep id want;
run;

View solution in original post


All Replies
Respected Advisor
Posts: 4,919

Re: dealing with time periods with start and end date and overlap

Posted in reply to lillymaginta

There could be many interpretations to your sample data. To help clarify things, what would you expect for the data:

 

a 2002/01/01 2002/01/31
b 2002/01/01 2002/12/31
c 2002/12/01 2002/12/31
d 2002/06/01 2003/06/30
PG
Frequent Contributor
Posts: 128

Re: dealing with time periods with start and end date and overlap

Ignoring patients ids, I am just trying to get aggregate numbers over a large data. I want to know the total number of prescriptions that overlap for 30 days or more. 

If two medications overlap, then 2 

if three medications overlap 3. 

The output above shows the classes that overlaped. so for 2 a-b and so on 

Super User
Posts: 10,020

Re: dealing with time periods with start and end date and overlap

Posted in reply to lillymaginta

data have;
input drug  $ start : mmddyy10.    end : mmddyy10.;
format start end mmddyy10.;
cards;
a        01/01/2001    02/15/2001
b        01/02/2001    02/10/2001
c        04/05/2002     06/07/2002
d        04/15/2002     07/01/2002
e       04/01/2002      07/18/2002
f       04/05/2002      06/01/2002 
;
run;

data have;
 set have;
 dif=lag(end)-start;
 if dif lt 30 then group+1;
run;
data want;
length want $ 200;
 do i=1 by 1 until(last.group);
  set have;
  by group;
  want=catx('-',want,drug);
 end;
if i ne 1 then output;
keep want;
run;


Frequent Contributor
Posts: 128

Re: dealing with time periods with start and end date and overlap

KSHARP thank you! your code works perfect for the small dataset that has multiple medications for the same patient. Can one adjust the code if you have multiple ids? How can one modify the code if you have something like this with the first column be pt id? 

 

1 a        01/01/2001    02/15/2001
1 b        01/02/2001    02/10/2001
1 c        04/05/2002     06/07/2002
1 d        04/15/2002     07/01/2002
1 e       04/01/2002      07/18/2002
1 f       04/05/2002      06/01/2002
2 a        01/01/2001    02/15/2001
2 b        01/02/2001    02/10/2001
2 c        04/05/2002     06/07/2002
2 d        04/15/2002     07/01/2002
2 e       04/01/2002      07/18/2002
2 f       04/05/2002      06/01/2002

 

Super User
Posts: 11,343

Re: dealing with time periods with start and end date and overlap

Posted in reply to lillymaginta

Perhaps something like this if you want the groups within id, which makes more sense than across id.

data have;
   input id drug  $ start : mmddyy10.    end : mmddyy10.;
   format start end mmddyy10.;
cards;
1 a        01/01/2001    02/15/2001
1 b        01/02/2001    02/10/2001
1 c        04/05/2002     06/07/2002
1 d        04/15/2002     07/01/2002
1 e       04/01/2002      07/18/2002
1 f       04/05/2002      06/01/2002
2 a        01/01/2002    02/15/2002
2 b        01/05/2002    03/19/2002
2 c        01/24/2002     02/17/2002
2 d        04/10/2002     07/01/2002
2 e       04/01/2002      07/28/2002
2 f       04/25/2002      06/30/2002
;
run;

data temp;
   set have;
   by id;
   dif=lag(end)-start;

   if first.id then do;
      group=.;
      dif =.;
   end;
   if dif lt 30 then group+1;
run;

data want;
   length want $ 200;
   do i=1 by 1 until(last.group);
      set Temp;
      by id group;
      want=catx('-',want,drug);
   end;
   if i ne 1 then output;
   keep id want;
run;

I changed some date values for id=2 so that the results differed between Id. You need to decide if the rules have been applied correctly.

 

 

If your main data isn't sorted by ID and start date it should be before doing the step that creates temp.

Frequent Contributor
Posts: 128

Re: dealing with time periods with start and end date and overlap

Super! Thank you 

Frequent Contributor
Posts: 128

Re: dealing with time periods with start and end date and overlap

I have one more questions. 

Lets assume the output for id is the following:

1 a-b

1 a-b-c

1 a-b-c-d 

1 a-b-c-e

2 a-b

2-c-d 

How can I create table the keeps only the highest combination for each id. I mean here we have a combintion of two drugs, three drugs, and four drugs. 

I want to create a table that tell me the highest combination for patient 1 was 4 and for patient 2 is 2 and so on. Here is the output 

1  4

2 2 

 

Super User
Posts: 10,020

Re: dealing with time periods with start and end date and overlap

Posted in reply to lillymaginta
data x;
input a b $20.;
cards;
1 a-b
1 a-b-c
1 a-b-c-d 
1 a-b-c-e
2 a-b
2 c-d 
;
run;
proc sql;
select a,max(countc(b,'-'))+1 as max
 from x
  group by a;
quit;

Super User
Posts: 10,020

Re: dealing with time periods with start and end date and overlap

Posted in reply to lillymaginta
Yeah. Try @ballardw 's code , or this one:


data have;
   input id drug  $ start : mmddyy10.    end : mmddyy10.;
   format start end mmddyy10.;
cards;
1 a        01/01/2001    02/15/2001
1 b        01/02/2001    02/10/2001
1 c        04/05/2002     06/07/2002
1 d        04/15/2002     07/01/2002
1 e       04/01/2002      07/18/2002
1 f       04/05/2002      06/01/2002
2 a        01/01/2002    02/15/2002
2 b        01/05/2002    03/19/2002
2 c        01/24/2002     02/17/2002
2 d        04/10/2002     07/01/2002
2 e       04/01/2002      07/28/2002
2 f       04/25/2002      06/30/2002
;
run;

data temp;
   set have;
   by id;
   dif=lag(end)-start;
   if dif lt 30 or first.id then group+1;
run;

data want;
   length want $ 200;
   do i=1 by 1 until(last.group);
      set Temp;
      by id group;
      want=catx('-',want,drug);
   end;
   if i ne 1 then output;
   keep id want;
run;

Frequent Contributor
Posts: 128

Re: dealing with time periods with start and end date and overlap

Thank you ksharp, I just realized that the output of the first code is incorrect . The 30 days overlap should be 30 days or more and should his card drugs that overlap in less than 30 days. For example, for the entries below, the code should not retain any drugs for this patient because there is no overlap of 30 days or more between the different drugs. 

1 c        04/05/2002     06/07/2002
1 d        04/15/2002     04/25/2002
1 e       04/15/2002      04/18/2002

 

Solution
‎12-05-2016 08:37 AM
Super User
Posts: 10,020

Re: dealing with time periods with start and end date and overlap

Posted in reply to lillymaginta
OK. So you also care about END-START ?




data have;
   input id drug  $ start : mmddyy10.    end : mmddyy10.;
   format start end mmddyy10.;
cards;
1 c        04/05/2002     06/07/2002
1 d        04/15/2002     04/25/2002
1 e       04/15/2002      04/18/2002
2 a        01/01/2002    02/15/2002
2 b        01/05/2002    03/19/2002
2 c        01/24/2002     02/17/2002
2 d        04/10/2002     07/01/2002
2 e       04/01/2002      07/28/2002
2 f       04/25/2002      06/30/2002
;
run;

data temp;
   set have;
   by id;
   dif=lag(end)-start;
   if dif lt 30 or (end-start) lt 30 or first.id then group+1;
run;

data want;
   length want $ 200;
   do i=1 by 1 until(last.group);
      set Temp;
      by id group;
      want=catx('-',want,drug);
   end;
   if i ne 1 then output;
   keep id want;
run;

Frequent Contributor
Posts: 128

Re: dealing with time periods with start and end date and overlap

Thank you so much Ksharp!! five starts! 

Frequent Contributor
Posts: 128

Re: dealing with time periods with start and end date and overlap

one more question, after the first code, how can I exclude drugs that overlap for the same drug. For example if I have the following outpit:

a-b-c-a, I want to drop the duplicate a and keep it as a-b-c

the same for a-a, I dont want to count the overlap if it is the same drug . Thank you again for all the help! 

Super User
Posts: 10,020

Re: dealing with time periods with start and end date and overlap

Posted in reply to lillymaginta
No problem.




data have;
   input id drug  $ start : mmddyy10.    end : mmddyy10.;
   format start end mmddyy10.;
cards;
1 c        04/05/2002     06/07/2002
1 d        04/15/2002     04/25/2002
1 e       04/15/2002      04/18/2002
2 a        01/01/2002    02/15/2002
2 b        01/05/2002    03/19/2002
2 c        01/24/2002     02/17/2002
2 d        04/10/2002     07/01/2002
2 e       04/01/2002      07/28/2002
2 f       04/25/2002      06/30/2002
2 f       04/25/2002      07/30/2002
;
run;

data temp;
   set have;
   by id;
   dif=lag(end)-start;
   if dif lt 30 or (end-start) lt 30 or first.id then group+1;
run;

data want;
array x{99999} $ 100 _TEMPORARY_;
n=0;
call missing(of x{*});
length want $ 200;

   do i=1 by 1 until(last.group);
      set Temp;
      by id group;
      if drug not in x then do;
       n+1;x{n}=drug;
       want=catx('-',want,drug);
      end;
   end;
   if i ne 1 then output;
   keep id want;
run;

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 15 replies
  • 536 views
  • 5 likes
  • 4 in conversation