DATA Step, Macro, Functions and more

Count Transactions

Reply
Super Contributor
Posts: 647

Count Transactions

Table A has Id and Process_dt. Table B has Id,Trans_dt,Amount.

How to get number of transactions for each id whose transactions happened in last 30 days of the process_dt and next 30 days of the process_dt and their sum of amount as well.

id  present  no.of trans_under_30   no.of_trans_next_30   total_amount

Column present gives info 'Y' 'N' ,if the id is present in table A or not . Above are columns in the final table required.

Super User
Posts: 9,662

Re: Count Transactions

SQL can do that. Showing your data and the final output will explain your question more clear.

Ksharp

Super Contributor
Posts: 647

Re: Count Transactions

Table A

id             process_dt

11            06JUN2012

12           15AUG2012

13            01SEP2012

Table B

id           Trans_dt              Amount

11          05/15/2012          50

11          05/31/2012          30

11          06/30/012            40

11          07/01/2012          70

12          08/31/2012           50

final table:

id            present              trans_before_30_days              trans_next_30days                     Amount_before        amount_after

11             Y                          2                                             2                                           80                          110

12             Y                          0                                              1                                            0                             50

13             N                          0                                              0                                            0                              0

Super User
Posts: 9,662

Re: Count Transactions

OK.

data a;
input id             process_dt : date9.;
format process_dt  date9.;
cards;
11            06JUN2012
12           15AUG2012
13            01SEP2012
;
run;

data b;
input id           Trans_dt    : mmddyy10.          Amount;
format Trans date9.;
cards;
11          05/15/2012          50
11          05/31/2012          30
11          06/30/012            40
11          07/01/2012          70
12          08/31/2012           50
;
run;
proc sql;
create table want as
 select id,case when id in (select distinct id from b) then 'Y' else 'N' end as present,
        (select count(*) from b where a.id=b.id and b.Trans_dt between a.process_dt-30 and a.process_dt) as before,
        (select count(*) from b where a.id=b.id and b.Trans_dt between a.process_dt and a.process_dt+30) as after,
        coalesce((select sum(Amount) from b where a.id=b.id and b.Trans_dt between a.process_dt-30 and a.process_dt),0) as amount_before,
        coalesce((select sum(Amount) from b where a.id=b.id and b.Trans_dt between a.process_dt and a.process_dt+30),0) as amount_after
  from a;
quit;

Ksharp

Super Contributor
Posts: 647

Re: Count Transactions

thanks Ksharp.

inner join can get this result?

Super User
Posts: 9,662

Re: Count Transactions

I afraid you couldn't , the main problem is making PRESENT variable.

Super Contributor
Posts: 647

Re: Count Transactions

table b is a huge table.doing when id in (select distinct id from b) then 'Y' else 'N' end as present will be taking a long time,I believe.

Super User
Posts: 9,662

Re: Count Transactions

Did you try ?

or need another data step(hash table) to get this variable firstly .

or split it into lots of small tables .

Message was edited by: xia keshan

Super User
Super User
Posts: 6,497

Re: Count Transactions

Try this.  Not sure how well SAS will be able to optimize this.

Idea is to combine the two and set binary flag for whether the event is before or after.  Then you can use that in a SUM aggregate function in the outer query. 

You need to think about where to count the case when TRANS_DT = PROCESS_DT.  Is that in the 30 days before or the 30 days after?

I am not sure how you want to calculate the PRESENT flag.  You might need to move the data range checking out of the ON condition for the full join and into a WHERE condition for the outer query depending one what you want to do with cases where it is in HAVE1 but there are no dates in the 60 day window in HAVE2.

data have1;

input id process_dt : date9.;

format process_dt date9.;

cards;

11 06JUN2012

12 15AUG2012

13 01SEP2012

run;

data have2;

input id Trans_dt : mmddyy10. Amount;

format Trans_dt date9.;

cards;

11 05/15/2012 50

11 05/31/2012 30

11 06/30/2012 40

11 07/01/2012 70

12 08/31/2012 50

run;


proc sql;

create table want as

select id

     , max(c.present) as present

     , sum(c.before) as n_before

     , sum(c.after ) as n_after

     , coalesce(sum( c.amount * c.before ),0) as sum_before

     , coalesce(sum( c.amount * c.after  ),0) as sum_after

from (

  select coalesce(a.id,b.id) as id

       , (a.process_dt > b.trans_dt > .) as before

       , (. < a.process_dt < b.trans_dt) as after

       , case when (b.id is null) then 'N' else 'Y' end as present

       , b.amount

  from have1 a

       full join have2 b

       on a.id = b.id and (b.trans_dt between a.process_dt - 30 and a.process_dt + 30)

      ) c

group by 1

order by 1

;

quit;

proc print; run;

Ask a Question
Discussion stats
  • 8 replies
  • 635 views
  • 3 likes
  • 3 in conversation