DATA Step, Macro, Functions and more

SUM Based on Rolling Date

Accepted Solution Solved
Reply
New Contributor
Posts: 3
Accepted Solution

SUM Based on Rolling Date

Hello,

   I am trying to SUM transactions based on a 7 day period looking for accounts with $2,000 or more in transactions and I am having difficulty with the code and have not found anything through google that would solve my problem.

I have several months of transactional data by account and by date that I need to check if at any time in a 7 day period the account had $2,000 or more in transactions.  Accounts can have more than 1 transaction per day and transactions can occur on any day (including holidays).  The 7 day period can start/end on any day of the week.

I was hoping to use two a DO loop to determine the start date and end date however I have been unsucessful and would appreciate any guidance.

Below is a sample layout of the data I have:

Acct     Transaction     date

1234          50.00          01/01/2012

1234          1500.00      01/03/2012

4275          500.00        01/07/2012

There are no negative numbers in the transaction amount field.

Any help woudl be appreciated as I have spent the afternoon working on this without sucess.  I only need to output records that meet the requirements.

It almost feels like I need a do loop with a proc sql  inside a do loop but I have been unable to find anyway to do so.


Accepted Solutions
Solution
‎05-15-2012 07:22 PM
Super Contributor
Posts: 1,636

Re: SUM Based on Rolling Date

data have;

informat date mmddyy10.;

format date mmddyy10.;

input Acct     Transaction     date;

cards;

1234          50.00          01/01/2012

1234          20.00      01/02/2012

1234          50.00          01/02/2012

1234          1500.00      01/03/2012

1234          50.00          01/04/2012

1234          300.00      01/03/2012

1234          50.00          01/06/2012

1234          1500.00      01/07/2012

1234          50.00          01/08/2012

1234          600.00      01/09/2012

1234          50.00          01/10/2012

1234          100.00      01/10/2012

2234          50.00          01/01/2012

2234          10.00      01/02/2012

2234          5.00          01/02/2012

2234          10.00      01/03/2012

2234          50.00          01/04/2012

2234          700.00      01/03/2012

2234          50.00          01/06/2012

2234          200.00      01/07/2012

2234          50.00          01/08/2012

2234          100.00      01/09/2012

2234          50.00          01/10/2012

2234          1800.00      01/10/2012

;

proc sql;

  create table temp as select distinct acct, date from have;

  create table want as select a.acct,a.date,sum(Transaction) as total

    from temp a ,have b

  where a.acct=b.acct and 0<=a.date-b.date<7

      group by a.acct,a.date

        having total>2000;

quit;

View solution in original post


All Replies
Super User
Posts: 10,526

Re: SUM Based on Rolling Date

If I had to do this quickly I would try something like:

create interval for each record. If I did this right you now have the 7 day periods containg the transation.

data intervals;

     set <existing data>;

     do i = (-6) to 0 by 1;

          interval = catx(' - ',put((date+i),mmddyy10.), put((date +i+6),mmddyy10.));

          output;

     end;

run;

proc summary data= intervals nway;

     class account interval;

     var transaction;

     output out= transsummary (where=(transsum ge 2000)) sum=transsum;

run;

proc print data=transsummary;

     var account interval transsum;

run;

Solution
‎05-15-2012 07:22 PM
Super Contributor
Posts: 1,636

Re: SUM Based on Rolling Date

data have;

informat date mmddyy10.;

format date mmddyy10.;

input Acct     Transaction     date;

cards;

1234          50.00          01/01/2012

1234          20.00      01/02/2012

1234          50.00          01/02/2012

1234          1500.00      01/03/2012

1234          50.00          01/04/2012

1234          300.00      01/03/2012

1234          50.00          01/06/2012

1234          1500.00      01/07/2012

1234          50.00          01/08/2012

1234          600.00      01/09/2012

1234          50.00          01/10/2012

1234          100.00      01/10/2012

2234          50.00          01/01/2012

2234          10.00      01/02/2012

2234          5.00          01/02/2012

2234          10.00      01/03/2012

2234          50.00          01/04/2012

2234          700.00      01/03/2012

2234          50.00          01/06/2012

2234          200.00      01/07/2012

2234          50.00          01/08/2012

2234          100.00      01/09/2012

2234          50.00          01/10/2012

2234          1800.00      01/10/2012

;

proc sql;

  create table temp as select distinct acct, date from have;

  create table want as select a.acct,a.date,sum(Transaction) as total

    from temp a ,have b

  where a.acct=b.acct and 0<=a.date-b.date<7

      group by a.acct,a.date

        having total>2000;

quit;

Super User
Posts: 9,687

Re: SUM Based on Rolling Date

How about:

data have;
informat date mmddyy10.;
format date mmddyy10.;
input Acct     Transaction     date;
cards;
1234          50.00          01/01/2012
1234          20.00      01/02/2012
1234          50.00          01/02/2012
1234          1500.00      01/03/2012
1234          50.00          01/04/2012
1234          300.00      01/03/2012
1234          50.00          01/06/2012
1234          1500.00      01/07/2012
1234          50.00          01/08/2012
1234          600.00      01/09/2012
1234          50.00          01/10/2012
1234          100.00      01/10/2012
2234          50.00          01/01/2012
2234          10.00      01/02/2012
2234          5.00          01/02/2012
2234          10.00      01/03/2012
2234          50.00          01/04/2012
2234          700.00      01/03/2012
2234          50.00          01/06/2012
2234          200.00      01/07/2012
2234          50.00          01/08/2012
2234          100.00      01/09/2012
2234          50.00          01/10/2012
2234          1800.00      01/10/2012
;

proc sql;
create table want as 
 select a.*,(select sum(Transaction) as sum 
                        from have  as b
                         where b.acct=a.acct and b.date between a.date-6 and a.date ) as sum
    from have as a ;

create table want_acct as
 select distinct acct
  from want
   where sum gt 2000;
quit;


Ksharp

Super Contributor
Posts: 1,636

Re: SUM Based on Rolling Date

Hi Ksharp,

Our output datasets are different. I wonder which one is the one OP wants?

Super User
Posts: 9,687

Re: SUM Based on Rolling Date

I guess my code is right . Just Guess. Smiley Happy

Respected Advisor
Posts: 3,124

Re: SUM Based on Rolling Date

Well I guess this is more complex than it seems:

if you switch 'a' and 'b' in your date range comparison, you will get different results, I couldn't figure out why.

Super User
Posts: 9,687

Re: SUM Based on Rolling Date

You can't switch 'a' and 'b' in date range comparison.

b.date is rolling date , a.date is current obs's date .You can't exchange them.

Ksharp

Super User
Posts: 5,086

Re: SUM Based on Rolling Date

Brandon,

I have a different approach, which is a little more complex.  However, it has the potential to be much faster since it requires only one pass through the data.  The idea is this, all contained in one DATA step.  Compute the total transactions for each ACCT/DATE.  Track the last 7 such totals.  For each date, look through the 7 totals, and see which fall within the right time period.  Sum them up.  Here is what the code could look like (untested):

data want (keep = acct date);

   set have;

   by acct date;

   array _7amnts {0:6} amnt1-amnt7;

   array _7dates {0:6} date1-date7;

   retain amnt1-amnt7 date1-date7;

   if first.acct then do _i_=0 to 6;

      _7amnts{_i_}=.;

      _7dates{_i_}=.;

   end;

   if first.date then total = amount;

   else total + amount;

   if last.date;

   dates_tracked + 1;

   _7amnts{mod(dates_tracked,7)} = total;

   _7dates{mod(dates_tracked,7)} = date;

   last7_total=0;

   do _i_=0 to 6;

      if (date-6 < _7dates{_i_} <= date) then last7_total + _7amnts{_i_};

   end;

   if last7_total > 2000;

run;

Hope this helps (and hope it works)!

Super User
Posts: 5,086

Re: SUM Based on Rolling Date

OK, notice 2 things.  I have made a few minor edits, so this final program should be it ... subject to the fact that I used AMOUNT as the variable name instead of TRANSACTION.  Again, hope it helps.

Super Contributor
Posts: 1,636

Re: SUM Based on Rolling Date

Astounding,

I need to add proc sort; by acct date; to run your code.

data have;

informat date mmddyy10.;

format date mmddyy10.;

input Acct     amount    date;

cards;

1234          50.00          01/01/2012

1234          20.00      01/02/2012

1234          50.00          01/02/2012

1234          1500.00      01/03/2012

1234          50.00          01/04/2012

1234          300.00      01/03/2012

1234          50.00          01/06/2012

1234          1500.00      01/07/2012

1234          50.00          01/08/2012

1234          600.00      01/09/2012

1234          50.00          01/10/2012

1234          100.00      01/10/2012

2234          50.00          01/01/2012

2234          10.00      01/02/2012

2234          5.00          01/02/2012

2234          10.00      01/03/2012

2234          50.00          01/04/2012

2234          700.00      01/03/2012

2234          50.00          01/06/2012

2234          200.00      01/07/2012

2234          50.00          01/08/2012

2234          100.00      01/09/2012

2234          50.00          01/10/2012

2234          1800.00      01/10/2012

;

proc sort;

  by acct date;

data want (keep = acct date);

   set have;

   by acct date;

   array _7amnts {0:6} amnt1-amnt7;

   array _7dates {0:6} date1-date7;

   retain amnt1-amnt7 date1-date7;

   if first.acct then do _i_=0 to 6;

      _7amnts{_i_}=.;

      _7dates{_i_}=.;

   end;

   if first.date then total = amount;

   else total + amount;

   if last.date;

   dates_tracked + 1;

   _7amnts{mod(dates_tracked,7)} = total;

   _7dates{mod(dates_tracked,7)} = date;

   last7_total=0;

   do _i_=0 to 6;

      if (date-6 < _7dates{_i_} <= date) then last7_total + _7amnts{_i_};

   end;

   if last7_total > 2000;

run;

proc print;run;

Super User
Posts: 5,086

Re: SUM Based on Rolling Date

Linlin,

Yes, the data needs to be in order.  I assumed it was, since the original post said the data was "transactional data by account and by date".  But if sorting is required, my solution might take longer to run.

Thanks!

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 11 replies
  • 1250 views
  • 3 likes
  • 6 in conversation