DATA Step, Macro, Functions and more

Counting medications by id within 90 days

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 13
Accepted Solution

Counting medications by id within 90 days

Hi - I am trying to count the number of medications in this sample data by id and within 90 days from the start date for each prescription.

For example: take the first record, add 90 days to get the end date as 04/19/12. Now start counting # of meds (med_cnt) until the start date reaches this end date 04/19/12, then go to the next record and start over again until last.id.

Any help is greatly appreciated!

Here is the sample data:

data med;

input id $ start_dt:mmddyy8. med;

cards;

001 01/20/12 1

001 01/31/12 1

001 02/17/12 1

001 04/25/12 1

002 02/02/12 1

002 05/01/12 2

;

The output would be something like this:

id        start_dt     med     end_dt     med_cnt

001     01/20/12     1     04/19/12     1

001     01/31/12     1     04/30/12     2

001     02/17/12     1     05/17/12     3

001     04/25/12     1     07/24/12     3   ==> as soon as start date 4/5 > the first end date 4/19, then go to the 2nd observation and start again.

001     10/11/12     1     01/09/14     1

002     02/02/12     1     05/03/12     1

002     05/01/13     2     07/30/12     3


Accepted Solutions
Solution
‎05-23-2014 04:50 PM
Respected Advisor
Posts: 4,651

Re: Counting medications by id within 90 days

A datastep alternative, using random access:

data want3;

set med;

do point=_n_-1 to 1 by -1;

  set med(keep=id start_date med

       rename=(id=same_id start_date=other_date med=other_med))

       point=point;

  if id ne same_id or

       intck("DAY", other_date, start_date) > 89 then leave;

  med + other_med;

  end;

keep id start_date med;

rename med=total_med;

run;

PG

PG

View solution in original post


All Replies
Respected Advisor
Posts: 3,895

Re: Counting medications by id within 90 days

One way to go

data med;
  input id $ start_date:mmddyy8. med;
  format start_date date9.;
  cards;
001 01/20/12 1
001 01/31/12 1
001 02/17/12 1
001 04/25/12 1
001 10/11/12 1
002 02/02/12 1
002 05/01/12 2
;
run;

proc sql;
  create table want as
    select o.*
      ,(select sum(i.med) from med i where o.id=i.id and 0 <= o.start_date-i.start_date <= 90) as med_count
    from med o
  ;
quit;

Occasional Contributor
Posts: 13

Re: Counting medications by id within 90 days

Hi Patrick - thanks for your help, the code works well, but it's been running for hours with 1.2 milllions prescriptions.

I was trying to do it with the data step approach and it's been a challenge. Will  a "do until loop" do the trick? Thanks again!

Respected Advisor
Posts: 4,651

Re: Counting medications by id within 90 days

Not obvious at all. You may try the alternate query:

proc sql;

create table want2 as

select m1.id, m1.start_date, m1.med, sum(m2.med) as medCount

from

  med as m1 inner join

  med as m2 on

       m1.id=m2.id and

            m2.start_date between intnx("DAY", m1.start_date, -90) and m1.start_date

group by m1.id, m1.start_date, m1.med;

quit;

which may or may not be more optimizable by SAS.

Good luck.

PG

PG
Solution
‎05-23-2014 04:50 PM
Respected Advisor
Posts: 4,651

Re: Counting medications by id within 90 days

A datastep alternative, using random access:

data want3;

set med;

do point=_n_-1 to 1 by -1;

  set med(keep=id start_date med

       rename=(id=same_id start_date=other_date med=other_med))

       point=point;

  if id ne same_id or

       intck("DAY", other_date, start_date) > 89 then leave;

  med + other_med;

  end;

keep id start_date med;

rename med=total_med;

run;

PG

PG
Respected Advisor
Posts: 3,895

Re: Counting medications by id within 90 days

Hi Kyle

The SQL I've posted is executing the expression in the Select clause for every single row. That is very inefficient. Sorry about that. I didn't expect you to have that many rows.

I've done some testing with 1.2M rows. The SQL Pierre posted is much more efficient (takes around 56 seconds real time on my laptop) but as expected the data step is much faster (even if you would have to pre-sort the data). The sort takes me only 0.18 seconds of real time and the data step 1.6 seconds.

Below the code I've used:

data med;
  attrib id length=8 start_date format=date9. med length=8;

  do id=1 to 10000;
    start_date='01jan2000'd;

    do while(start_date<today());
      start_date=start_date+ceil(ranuni(1)*30);
      med=ceil(ranuni(2)*1.2);
      output;
      _n+1;
    end;

    if _n>1200000 then
      stop;
  end;
run;

proc sql;
  create table want2 as
  select m1.id, m1.start_date, m1.med, sum(m2.med) as medCount
  from med as m1 inner join med as m2 on
    m1.id=m2.id
    and m2.start_date between intnx("DAY", m1.start_date, -90) and m1.start_date
  group by m1.id, m1.start_date, m1.med;
quit;

proc sort data=med out=med;
  by id start_date;
run;

data want3;
  set med;
  do point=_n_-1 to 1 by -1;
    set med(keep=id start_date med
      rename=(id=same_id start_date=other_date med=other_med))
      point=point;
    if id ne same_id or  intck("DAY", other_date, start_date) > 89 then
      leave;
    med + other_med;
  end;
  keep id start_date med;
  rename med=total_med;
run;

Thanks

Patrick

Respected Advisor
Posts: 4,651

Re: Counting medications by id within 90 days

Thanks for the test . It confirms my intuition. - PG

PG
Occasional Contributor
Posts: 13

Re: Counting medications by id within 90 days

I have tested all three methods and they all work fine. The last two were on a data sample, I have not had a chance to test them on the 1.2M data yet. I will let you know. Thank both so much for your help. Greatly appreciated!

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 7 replies
  • 509 views
  • 7 likes
  • 3 in conversation