This is pretty close to what you want, I think:
data trancount;
infile cards;
attrib custid actid length=4
attrib transdate length=4 informat=mmddyy10. format=yymmdd10.;
attrib transamt length=6 format=comma14.;
input custid
actid
transdate
transamt;
cards;
100 1 9/30/2011 1500
100 1 11/4/2011 2000
100 1 11/4/2011 5000
100 1 11/4/2011 1000
100 1 11/5/2011 4000
100 1 11/5/2011 2500
;
run;
proc sql;
create table result as
select custid,
actid,
transdate,
sum(transamt) as sumtransamt format=comma14.,
count(*) as cnt length=4
from trancount
where 0 le transamt lt 4000 /* It appears you don't actually want 4,000 inclusive */
group by custid,
actid,
transdate;
quit;
There's so many ways to skin a cat with SAS; the second step could easily be replaced with a sort and a data step. But as long as it remains as simple as this, it should scale well.
You didn't mention moving statistics in your initial post. Given your question, @LaurieF solution is correct.
Here's a SAS note regarding using an array to calculating moving statistics.
Note that you cannot be missing dates in your series for an array method to work correctly, ie you need a record for every day.
Oops, ignore everything except my first paragraph. You apparently want discrete (not rolling) 5-day windows, starting with 9/1/2011.
Your WANT dataset has 3 transactions for 11/4, but only 1 for 11/5. Is that consistent? The resulting CNT variable represents only qualifying transactions (<4000) for 9/30 and 11/5, but counts ALL transactions for 11/4.
You asked for an array solution, which I take means you want a data step approach. However, while the array solution might be good for rolling windows of fixed size, or for scanning over several amounts within each transaction, I don't think it makes sense here. Here's a data step program the generates the same SUMTRANSAMT value as you have, and generates a corresponding count of eligible transaction
data want;
set trancount;
by custid actid transdate ;
if first.transdate then do;
cnt=0; sumtransamt=.;
end;
if 0<=transamt<4000 then do;
cnt+1;
sumtransamt+transamt;
end;
if last.transdate;
drop transamt;
run;
I'm not saying it's impossible to use arrays, but at the least, you'd have to know in advance the maximum number of transactions that might occur on a given date, so that you'd declare an array that is big enough to hold every transaction for the day.
Here's a solution that applies to 5-day date periods, using 01sep2011 as the reference date, but using arrays still seems sub-optimal. It uses the same basic logic, and assumes your data are sorted by custid actid transdate:
data need/view=need;
set trancount;
date_group=floor((transdate-'01sep2011'd)/5);
run;
data want;
set need;
by custid actid date_group ;
if first.date_group then do;
cnt=0; sumtransamt=.;
end;
if 0<=transamt<4000 then do;
cnt+1;
sumtransamt+transamt;
end;
if last.date_group;
drop date_group transamt;
run;
data trancount; infile cards; attrib custid actid length=4; attrib transdate length=4 informat=mmddyy10. format=yymmdd10.; attrib transamt length=6 format=comma14.; input custid actid transdate transamt; cards; 100 1 9/30/2011 1500 100 1 11/4/2011 2000 100 1 11/4/2011 5000 100 1 11/4/2011 1000 100 1 11/5/2011 4000 100 1 11/5/2011 2500 ; run; data want; n=0; sumtransamt =0; do until(last.transdate); set trancount; by custid actid transdate; if 0<4000 then do; n+1; sumtransamt+transamt; end; end; drop transamt; run;
Guys I really appreciate all of ur help & i ll use ur suggestions and apply on my data. Want to clarify one thing regarding the data. The data i gave is just for one cust id and i have several such custid in my data with transaction date from 1sep 2011-30sep2012. Also there are instances where customers don't have transactions everyday. If there are multiple transactions in a day i need to sum it up.First i need to identify transaction between 0-4k by using a rolling 5 day window ex: 1st spt to 5thsep, then 2nd sept to 6sep and so on and then i need to sum the transactions by cust id & acct id. Its little complex to assemble and thanks all for helping me out.
@SM3 wrote:
Guys I really appreciate all of ur help & i ll use ur suggestions and apply on my data. Want to clarify one thing regarding the data. The data i gave is just for one cust id and i have several such custid in my data with transaction date from 1sep 2011-30sep2012. Also there are instances where customers don't have transactions everyday. If there are multiple transactions in a day i need to sum it up.First i need to identify transaction between 0-4k by using a rolling 5 day window ex: 1st spt to 5thsep, then 2nd sept to 6sep and so on and then i need to sum the transactions by cust id & acct id. Its little complex to assemble and thanks all for helping me out.
This shouldn't be difficult and is within the first SAS programming course, break it down into smaller steps and you'll be fine. BY group processing will handle multiple ID and accounts without any difficulty. Please note, that I'm not trying to trivialize what you're trying to do, only trying to say there isn't anything overly complex with what you've specified.
Your steps appear out of order, but perhaps that's from what you're posting here. You're going to need to summarize per day BEFORE you calculate a rolling 5 day window, otherwise your filter won't be what you're expecting. You also need to define your 5 day definition carefully. Is it always 5 days, or 5 business days or 5 transaction days? September is fine, but imagine any 5 day period that includes Dec 24th and 26th.
Make sure that the data you provide shows the complexities of your data, but not overcomplicating it. Including a few ID/accounts is a good start.
I know it's sometimes harder than if looks like from outside and I get it wrong more often than not when asking a question BUT: It's really worth if you take the time and provide fully representative sample data (a data step creating such data) and then show us how the expected result should look like. This avoids a lot of missunderstandings plus people giving their time to you don't waste their effort for the wrong problem.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.