Table A has Id and Process_dt. Table B has Id,Trans_dt,Amount.
How to get number of transactions for each id whose transactions happened in last 30 days of the process_dt and next 30 days of the process_dt and their sum of amount as well.
id present no.of trans_under_30 no.of_trans_next_30 total_amount
Column present gives info 'Y' 'N' ,if the id is present in table A or not . Above are columns in the final table required.
SQL can do that. Showing your data and the final output will explain your question more clear.
Ksharp
Table A
id process_dt
11 06JUN2012
12 15AUG2012
13 01SEP2012
Table B
id Trans_dt Amount
11 05/15/2012 50
11 05/31/2012 30
11 06/30/012 40
11 07/01/2012 70
12 08/31/2012 50
final table:
id present trans_before_30_days trans_next_30days Amount_before amount_after
11 Y 2 2 80 110
12 Y 0 1 0 50
13 N 0 0 0 0
OK.
data a; input id process_dt : date9.; format process_dt date9.; cards; 11 06JUN2012 12 15AUG2012 13 01SEP2012 ; run; data b; input id Trans_dt : mmddyy10. Amount; format Trans date9.; cards; 11 05/15/2012 50 11 05/31/2012 30 11 06/30/012 40 11 07/01/2012 70 12 08/31/2012 50 ; run; proc sql; create table want as select id,case when id in (select distinct id from b) then 'Y' else 'N' end as present, (select count(*) from b where a.id=b.id and b.Trans_dt between a.process_dt-30 and a.process_dt) as before, (select count(*) from b where a.id=b.id and b.Trans_dt between a.process_dt and a.process_dt+30) as after, coalesce((select sum(Amount) from b where a.id=b.id and b.Trans_dt between a.process_dt-30 and a.process_dt),0) as amount_before, coalesce((select sum(Amount) from b where a.id=b.id and b.Trans_dt between a.process_dt and a.process_dt+30),0) as amount_after from a; quit;
Ksharp
thanks Ksharp.
inner join can get this result?
I afraid you couldn't , the main problem is making PRESENT variable.
table b is a huge table.doing when id in (select distinct id from b) then 'Y' else 'N' end as present will be taking a long time,I believe.
Did you try ?
or need another data step(hash table) to get this variable firstly .
or split it into lots of small tables .
Message was edited by: xia keshan
Try this. Not sure how well SAS will be able to optimize this.
Idea is to combine the two and set binary flag for whether the event is before or after. Then you can use that in a SUM aggregate function in the outer query.
You need to think about where to count the case when TRANS_DT = PROCESS_DT. Is that in the 30 days before or the 30 days after?
I am not sure how you want to calculate the PRESENT flag. You might need to move the data range checking out of the ON condition for the full join and into a WHERE condition for the outer query depending one what you want to do with cases where it is in HAVE1 but there are no dates in the 60 day window in HAVE2.
data have1;
input id process_dt : date9.;
format process_dt date9.;
cards;
11 06JUN2012
12 15AUG2012
13 01SEP2012
run;
data have2;
input id Trans_dt : mmddyy10. Amount;
format Trans_dt date9.;
cards;
11 05/15/2012 50
11 05/31/2012 30
11 06/30/2012 40
11 07/01/2012 70
12 08/31/2012 50
run;
proc sql;
create table want as
select id
, max(c.present) as present
, sum(c.before) as n_before
, sum(c.after ) as n_after
, coalesce(sum( c.amount * c.before ),0) as sum_before
, coalesce(sum( c.amount * c.after ),0) as sum_after
from (
select coalesce(a.id,b.id) as id
, (a.process_dt > b.trans_dt > .) as before
, (. < a.process_dt < b.trans_dt) as after
, case when (b.id is null) then 'N' else 'Y' end as present
, b.amount
from have1 a
full join have2 b
on a.id = b.id and (b.trans_dt between a.process_dt - 30 and a.process_dt + 30)
) c
group by 1
order by 1
;
quit;
proc print; run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.