DATA Step, Macro, Functions and more

Transaction count & sum across days using arrays

Reply
Occasional Contributor SM3
Occasional Contributor
Posts: 9

Transaction count & sum across days using arrays

Hi.I have data as below. I want to calculate sum & count of transactions across days based on condition "transaction amount between 0 to 4000" by cust and act.I have data from sep1st. 2011 to sep1st 2012

Cust id actid trasdate transamt
100 1 9/30/2011 1500
100 1 11/4/2011 2000
100 1 11/4/2011 5000
100 1 11/4/2011 1000
100 1 11/5/2011 4000
100 1 11/5/2011 2500

I want result as

custd actid transdate sumtransamt cnt
100 1 9/30/2011 1500 1
100 1 11/4/2011 3000 3
100 1 11/5/2011 2500 1

Appreciate ur help.

Thanks
Super Contributor
Posts: 252

Re: Transaction count & sum across days using arrays

This is pretty close to what you want, I think:

data trancount;
infile cards;
attrib custid actid length=4
attrib transdate length=4 informat=mmddyy10. format=yymmdd10.;
attrib transamt length=6 format=comma14.;
input custid
      actid
      transdate
      transamt;
cards;
100 1 9/30/2011 1500
100 1 11/4/2011 2000
100 1 11/4/2011 5000
100 1 11/4/2011 1000
100 1 11/5/2011 4000
100 1 11/5/2011 2500
;
run;

proc sql;
create table result as
   select custid,
          actid,
          transdate,
          sum(transamt) as sumtransamt format=comma14.,
          count(*) as cnt length=4
     from trancount
    where 0 le transamt lt 4000 /* It appears you don't actually want 4,000 inclusive */
    group by custid,
             actid,
             transdate;
quit;

There's so many ways to skin a cat with SAS; the second step could easily be replaced with a sort and a data step. But as long as it remains as simple as this, it should scale well.

Occasional Contributor SM3
Occasional Contributor
Posts: 9

Re: Transaction count & sum across days using arrays

Thanks Laurie.But I want to do this by using arrays. Can u help?
Occasional Contributor SM3
Occasional Contributor
Posts: 9

Re: Transaction count & sum across days using arrays

Array because the sum of transactions and count should be done by taking 5 days recurring window i,e sep1-5 then sept2-6 like that it goes on.
Super User
Posts: 19,772

Re: Transaction count & sum across days using arrays

You didn't mention moving statistics in your initial post. Given your question, @LaurieF solution is correct. 

 

Here's a SAS note regarding using an array to calculating moving statistics. 

Note that you cannot be missing dates in your series for an array method to work correctly, ie you need a record for every day. 

 

http://support.sas.com/kb/41/380.html

Occasional Contributor SM3
Occasional Contributor
Posts: 9

Re: Transaction count & sum across days using arrays

Yes Reeza, I agree LaurieF's solution is corrrct if we don't have moving statistics thing. Thanks
Trusted Advisor
Posts: 1,018

Re: Transaction count & sum across days using arrays

[ Edited ]

Oops, ignore everything except my first paragraph.  You apparently want discrete (not rolling) 5-day windows, starting with 9/1/2011.

 

Your WANT dataset has 3 transactions for 11/4, but only 1 for 11/5.  Is that consistent?  The resulting CNT variable represents only qualifying transactions (<4000) for 9/30 and 11/5, but counts ALL transactions for 11/4.

 

You asked for an array solution, which I take means you want a data step approach.  However, while the array solution might be good for rolling windows of fixed size, or for scanning over several amounts within each transaction, I don't think it makes sense here.  Here's a data step program the generates the same SUMTRANSAMT value as you have, and generates a corresponding count of eligible transaction

 

data want;

  set trancount;

  by custid actid transdate ;

  if first.transdate then do;

    cnt=0; sumtransamt=.;

  end;

  if 0<=transamt<4000 then do;

    cnt+1;

    sumtransamt+transamt;

  end;

  if last.transdate;

  drop transamt;

run;

 

I'm not saying it's impossible to use arrays, but at the least, you'd have to know in advance the maximum number of transactions that might occur on a given date, so that you'd declare an array that is big enough to hold every transaction for the day. 

Trusted Advisor
Posts: 1,018

Re: Transaction count & sum across days using arrays

Here's a solution that applies to 5-day date periods, using 01sep2011 as the reference date, but using arrays still seems sub-optimal.  It uses the same basic logic, and assumes your data are sorted by custid actid transdate:

 

data need/view=need;

  set trancount;

   date_group=floor((transdate-'01sep2011'd)/5);

run;

data want;

  set need;

  by custid actid date_group ;

  if first.date_group then do;

     cnt=0; sumtransamt=.;

  end;

  if 0<=transamt<4000 then do;

    cnt+1;

    sumtransamt+transamt;

  end;

  if last.date_group;

  drop date_group transamt;

run;

 

 

 

Super User
Posts: 10,023

Re: Transaction count & sum across days using arrays


data trancount;
infile cards;
attrib custid actid length=4;
attrib transdate length=4 informat=mmddyy10. format=yymmdd10.;
attrib transamt length=6 format=comma14.;
input custid
      actid
      transdate
      transamt;
cards;
100 1 9/30/2011 1500
100 1 11/4/2011 2000
100 1 11/4/2011 5000
100 1 11/4/2011 1000
100 1 11/5/2011 4000
100 1 11/5/2011 2500
;
run;
data want;
n=0; sumtransamt =0;
 do until(last.transdate);
  set trancount;
  by custid actid transdate;
  if 0<4000 then do;
   n+1; sumtransamt+transamt;
  end;
 end;
 drop transamt;
run;

Occasional Contributor SM3
Occasional Contributor
Posts: 9

Re: Transaction count & sum across days using arrays

Guys I really appreciate all of ur help & i ll use ur suggestions and apply on my data. Want to clarify one thing regarding the data. The data i gave is just for one cust id and i have several such custid in my data with transaction date from 1sep 2011-30sep2012. Also there are instances where customers don't have transactions everyday. If there are multiple transactions in a  day i need to sum it up.First i need to identify transaction between 0-4k by using a rolling 5 day window ex: 1st spt to 5thsep, then 2nd sept to 6sep and so on and then i need to sum the transactions by cust id & acct id. Its little complex to assemble and thanks all for helping me out.

Super User
Posts: 19,772

Re: Transaction count & sum across days using arrays


SM3 wrote:

Guys I really appreciate all of ur help & i ll use ur suggestions and apply on my data. Want to clarify one thing regarding the data. The data i gave is just for one cust id and i have several such custid in my data with transaction date from 1sep 2011-30sep2012. Also there are instances where customers don't have transactions everyday. If there are multiple transactions in a  day i need to sum it up.First i need to identify transaction between 0-4k by using a rolling 5 day window ex: 1st spt to 5thsep, then 2nd sept to 6sep and so on and then i need to sum the transactions by cust id & acct id. Its little complex to assemble and thanks all for helping me out.


 

This shouldn't be difficult and is within the first SAS programming course, break it down into smaller steps and you'll be fine. BY group processing will handle multiple ID and accounts without any difficulty. Please note, that I'm not trying to trivialize what you're trying to do, only trying to say there isn't anything overly complex with what you've specified. 

 

Your steps appear out of order, but perhaps that's from what you're posting here. You're going to need to summarize per day BEFORE you calculate a rolling 5 day window, otherwise your filter won't be what you're expecting. You also need to define your 5 day definition carefully. Is it always 5 days, or 5 business days or 5 transaction days? September is fine, but imagine any 5 day period that includes Dec 24th and 26th.

 

Make sure that the data you provide shows the complexities of your data, but not overcomplicating it. Including a few ID/accounts is a good start. 

 

Respected Advisor
Posts: 4,173

Re: Transaction count & sum across days using arrays

@SM3

I know it's sometimes harder than if looks like from outside and I get it wrong more often than not when asking a question BUT: It's really worth if you take the time and provide fully representative sample data (a data step creating such data) and then show us how the expected result should look like. This avoids a lot of missunderstandings plus people giving their time to you don't waste their effort for the wrong problem.

Ask a Question
Discussion stats
  • 11 replies
  • 374 views
  • 1 like
  • 6 in conversation