DATA Step, Macro, Functions and more

frequency of multiple observations hapenning at the same time

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 110
Accepted Solution

frequency of multiple observations hapenning at the same time

id rx  date_start     days_supply  

1  a 1/1/2005         60  

1  b 12/1/2004       90

1  c  1/8/2005        30

1  c  4/4/2005        30

1  c  5/5/2006        30 

2  a  1/1/2005       60 

2  b  1/28/2005     30

2  a   1/1/2006     60 

2  b   1/28/2006   30 

3 a    1/1/2009     30

3 d    3/4/2009     30 

 

 

you can create an rx end date by adding rx_Date + supply= end_Date 

I want to find frequency of drugs that overlaped by 30 days or more, those on one, those on more 

each patient would be counted once per each drug mono, or drug dual comb

output:

a 1

c 1

d  1

a-b 1

a-b-c 1 

 

I hope this is easy to understand and that someone can help. Thanks 

 

 


Accepted Solutions
Solution
‎04-21-2016 05:38 PM
Super User
Posts: 10,474

Re: frequency of multiple observations hapenning at the same time

This is not a trivial exercise.

Here is my attempt but I am still not clear on some of your overlap rules.

The last dataset below, named SUBSET, has one record for each ID Rx combination and the date that the count for that combination reached 30 days of "overlap". Since your ID 2 as two instances of a-b overlapping that are essentially a year apart I am not sure how you are counting those. To me it seems that a gap like that would be a separate count. What if the gap had been 5 years? or 10?

 

Note that you have the potential of having more than 3 drugs overlap then this gets much more complicated as you would be looking at creating more separate categories and adding more exclusion rules (IF I understood them at all)

 

data have;
   informat id rx $4. date_start mmddyy10.;
   format date_start mmddyy10.;
   input id rx  date_start     days_supply  ;
datalines;
1  a 1/1/2005         60  
1  b 12/1/2004       90
1  c  1/8/2005        30
1  c  4/4/2005        30
1  c  5/5/2006        30 
2  a  1/1/2005       60 
2  b  1/28/2005     30
2  a   1/1/2006     60 
2  b   1/28/2006   30 
3 a    1/1/2009     30
3 d    3/4/2009     30 
;
run;
/* get one day for each ID Rx day of dose*/
data temp;
   set have;
   format dose_date mmddyy10.;
   do dose_date = date_start to (date_start + Days_supply -1);
      output;
   end;
run;

/* prepare to transpose*/
proc sort data=temp; by id dose_date rx ;run;
/* transpose, result will have one record per day
with Col1, Col2 ...Coln the rx. The n is determined
by the max number of RX on a single day encountered
*/
proc transpose data=temp out=trans;
   by id dose_date;
   var rx;
run;
/* create your categories, if you have more than 3 rx
you may need to get into th lexcomb functions to generate all the 
2 and 3 level variables
*/
data temp2;
   set trans;
   by id;
   length RxCat1 RxCat2 RxCat3 LastRxCat1 LastRxCat2 LastRxCat3 $ 10 ;
   /* You haven't exactly specified rules for overlap.
   If you have a 3 drug overlap, do you also include the
   2 or single? 
   This out put will have separate 1 , 2 and 3 overlap variables.
   */
   RxCat1 = col1;
   if not missing(col2) then RxCat2 = catx('-',col1,col2);
   else call missing(RxCat2);
   if not missing(col3) then RxCat3 = catx('-',col1,col2,col3);
   else call missing(RxCat3);
   retain LastRxCat1 LastRxCat2 LastRxCat3 '';
   if first.id then call missing( LastRxCat1, LastRxCat2, LastRxCat,CountRxCat1, CountRxCat2, CountRxCat3);
   /* the counting rule appears to be IF the RxCat1 is not part of a 2 or 3 level then count
      otherwise not
   */ 
   If not missing(RxCat1) then do; 
      if RxCat1 = LastRxCat1 and missing(RxCat2) and Missing(RxCat3) then CountRxCat1+1;
      Else CountRxCat1 = 1;
   end;
   Else CountRxCat1=0;
   /* and only count the RxCat2 if missing RxCat3*/
   If not missing(RxCat2) then do; 
      if RxCat2 = LastRxCat2 and missing(RxCat3) then CountRxCat2+1;
      Else CountRxCat2 = 1;
   end;
   Else CountRxCat2=0;
   If not missing(RxCat3) then do; 
      if RxCat3 = LastRxCat3 then CountRxCat3+1;
      Else CountRxCat3 = 1;
   end;
   Else CountRxCat3=0;

   LastRxCat1=RxCat1;
   LastRxCat2=RxCat2;
   LastRxCat3=RxCat3;
run;

data subset;
   set temp2;
   if CountRxCat1=30 or CountRxCat2=30 or CountRxCat3=30;
   Array c CountRxCat1 - CountRxCat3;
   length category $ 10;
   Array r RxCat1 - RxCat3;
   category = R[ whichn(30,of c(*))];
   keep  id dose_date Category ;
run;



View solution in original post


All Replies
Solution
‎04-21-2016 05:38 PM
Super User
Posts: 10,474

Re: frequency of multiple observations hapenning at the same time

This is not a trivial exercise.

Here is my attempt but I am still not clear on some of your overlap rules.

The last dataset below, named SUBSET, has one record for each ID Rx combination and the date that the count for that combination reached 30 days of "overlap". Since your ID 2 as two instances of a-b overlapping that are essentially a year apart I am not sure how you are counting those. To me it seems that a gap like that would be a separate count. What if the gap had been 5 years? or 10?

 

Note that you have the potential of having more than 3 drugs overlap then this gets much more complicated as you would be looking at creating more separate categories and adding more exclusion rules (IF I understood them at all)

 

data have;
   informat id rx $4. date_start mmddyy10.;
   format date_start mmddyy10.;
   input id rx  date_start     days_supply  ;
datalines;
1  a 1/1/2005         60  
1  b 12/1/2004       90
1  c  1/8/2005        30
1  c  4/4/2005        30
1  c  5/5/2006        30 
2  a  1/1/2005       60 
2  b  1/28/2005     30
2  a   1/1/2006     60 
2  b   1/28/2006   30 
3 a    1/1/2009     30
3 d    3/4/2009     30 
;
run;
/* get one day for each ID Rx day of dose*/
data temp;
   set have;
   format dose_date mmddyy10.;
   do dose_date = date_start to (date_start + Days_supply -1);
      output;
   end;
run;

/* prepare to transpose*/
proc sort data=temp; by id dose_date rx ;run;
/* transpose, result will have one record per day
with Col1, Col2 ...Coln the rx. The n is determined
by the max number of RX on a single day encountered
*/
proc transpose data=temp out=trans;
   by id dose_date;
   var rx;
run;
/* create your categories, if you have more than 3 rx
you may need to get into th lexcomb functions to generate all the 
2 and 3 level variables
*/
data temp2;
   set trans;
   by id;
   length RxCat1 RxCat2 RxCat3 LastRxCat1 LastRxCat2 LastRxCat3 $ 10 ;
   /* You haven't exactly specified rules for overlap.
   If you have a 3 drug overlap, do you also include the
   2 or single? 
   This out put will have separate 1 , 2 and 3 overlap variables.
   */
   RxCat1 = col1;
   if not missing(col2) then RxCat2 = catx('-',col1,col2);
   else call missing(RxCat2);
   if not missing(col3) then RxCat3 = catx('-',col1,col2,col3);
   else call missing(RxCat3);
   retain LastRxCat1 LastRxCat2 LastRxCat3 '';
   if first.id then call missing( LastRxCat1, LastRxCat2, LastRxCat,CountRxCat1, CountRxCat2, CountRxCat3);
   /* the counting rule appears to be IF the RxCat1 is not part of a 2 or 3 level then count
      otherwise not
   */ 
   If not missing(RxCat1) then do; 
      if RxCat1 = LastRxCat1 and missing(RxCat2) and Missing(RxCat3) then CountRxCat1+1;
      Else CountRxCat1 = 1;
   end;
   Else CountRxCat1=0;
   /* and only count the RxCat2 if missing RxCat3*/
   If not missing(RxCat2) then do; 
      if RxCat2 = LastRxCat2 and missing(RxCat3) then CountRxCat2+1;
      Else CountRxCat2 = 1;
   end;
   Else CountRxCat2=0;
   If not missing(RxCat3) then do; 
      if RxCat3 = LastRxCat3 then CountRxCat3+1;
      Else CountRxCat3 = 1;
   end;
   Else CountRxCat3=0;

   LastRxCat1=RxCat1;
   LastRxCat2=RxCat2;
   LastRxCat3=RxCat3;
run;

data subset;
   set temp2;
   if CountRxCat1=30 or CountRxCat2=30 or CountRxCat3=30;
   Array c CountRxCat1 - CountRxCat3;
   length category $ 10;
   Array r RxCat1 - RxCat3;
   category = R[ whichn(30,of c(*))];
   keep  id dose_date Category ;
run;



Frequent Contributor
Posts: 110

Re: frequency of multiple observations hapenning at the same time

Hi Thank you for taking the time to answer this question. For your comments:

for patient 2: the a-b overlap would be counted once per id (depending on your definition it could be changed)   

in case we have a-b-c overlap this would be counted as 1 triple but would not be counted under a-b or b-c //unless the days supply for 2 of the combination exceed the third then we would have a-b-c and b-c //

I don't need info on drug overlap of more than 3

Your last data subset is the exact output am looking for! Thank you so much! Smiley Very Happy !! 

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 2 replies
  • 260 views
  • 1 like
  • 2 in conversation