- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I have a list of transactions, each with an associated date and account ID. Does anyone know of a simple way of determining if an account has accumulated a threshold number of dollars in a date range (ex. $X in any twelve-month period over the past ten years)?
For instance, I might need to know if an account gave $5000 in any twelve-month period in the past ten years. It sounds very simple, but I cannot find a way to achieve this without either creating a table for each day of the ten-year period or for creating a variable for each transaction in an account's transactional history (which would also yield a very large number of variables).
Thanks,
Mike
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
data have;
input tx_id tx_date account_id amt;
informat tx_date date9.;
format tx_date date9.;
cards;
1 01jan2012 101 1000
2 02jan2012 101 1000
3 03jan2012 101 1000
4 04jan2012 101 1000
5 05jan2012 101 1000
6 06jan2012 101 1000
7 07jan2012 101 1000
8 08jan2012 101 1000
9 09jan2012 101 1000
10 10jan2012 101 1000
11 01jan2012 102 5000
12 01jan2012 103 100
;
run;
proc sql;
select distinct account_id from
(select
t1.account_id
,t1.tx_date
from
work.have t1
left outer join work.have t2
on t1.account_id=t2.account_id
and t1.tx_date <= t2.tx_date
and intnx('year',t1.tx_date,1)>=t2.tx_date
group by t1.account_id, t1.tx_date
having sum(t2.amt)>=5000) ;
quit;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
If the dates are SAS date values this may not be too difficult.
Proc format;
value tdate
/* this is where the work goes on by specifying date ranges of interest*/
'01APR2000'd - '31MAR2001'd
'01APR2001'd - '31MAR2002'd
'01APR2002'd - '31MAR2003'd
'01APR2003'd - '31MAR2004'd
'01APR2004'd - '31MAR2005'd
'01APR2005'd - '31MAR2006'd
'01APR2006'd - '31MAR2007'd /* continue as needed to span your time range*/
;
run;
I would probably build a variety of permanent formats maybe 12 such as tdateJan tdateFeb tdateMar tdateApr (above)
and then change the format in the proc summary below.
proc summary data=yourdata nway;
class AccountId TransActionDate;
format TransactionDate Tdate.;
var TransactionValue;
output out=TransactionSum (where=(TransactionSum ge 5000)) sum=TransactionSum;
run
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Mike,
I couldn't write it without a sample dataset. And, are you looking for individual transactions during the twelve month periods or a sum of the transactions within a month, or something else altogether?
If it is the sum of transactions, I would use proc summary to create a file that has the monthly sums for each id for each month of the ten-year period i.e., include a format that converted dates to monyy7..
Then, I'd use a datastep that only kept the ID number with (1) a 120 cell array to capture all of the months; (2) a loop that went through each of the possible 12 month periods and, in that loop, build a 12 cell array to capture the 12 months and use the sum function to determine if the value was ge 5000.
If that criterion is reached, output the ID number, exit the loop and begin the process over with the next id's information..
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
data have;
input tx_id tx_date account_id amt;
informat tx_date date9.;
format tx_date date9.;
cards;
1 01jan2012 101 1000
2 02jan2012 101 1000
3 03jan2012 101 1000
4 04jan2012 101 1000
5 05jan2012 101 1000
6 06jan2012 101 1000
7 07jan2012 101 1000
8 08jan2012 101 1000
9 09jan2012 101 1000
10 10jan2012 101 1000
11 01jan2012 102 5000
12 01jan2012 103 100
;
run;
proc sql;
select distinct account_id from
(select
t1.account_id
,t1.tx_date
from
work.have t1
left outer join work.have t2
on t1.account_id=t2.account_id
and t1.tx_date <= t2.tx_date
and intnx('year',t1.tx_date,1)>=t2.tx_date
group by t1.account_id, t1.tx_date
having sum(t2.amt)>=5000) ;
quit;