BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
mcopple
Calcite | Level 5

I have a list of transactions, each with an associated date and account ID. Does anyone know of a simple way of determining if an account has accumulated a threshold number of dollars in a date range (ex. $X in any twelve-month period over the past ten years)?

For instance, I might need to know if an account gave $5000 in any twelve-month period in the past ten years. It sounds very simple, but I cannot find a way to achieve this without either creating a table for each day of the ten-year period or for creating a variable for each transaction in an account's transactional history (which would also yield a very large number of variables).

Thanks,

Mike

1 ACCEPTED SOLUTION

Accepted Solutions
DBailey
Lapis Lazuli | Level 10

data have;

input tx_id tx_date account_id amt;

informat tx_date date9.;

format tx_date date9.;

cards;

1 01jan2012 101 1000

2 02jan2012 101 1000

3 03jan2012 101 1000

4 04jan2012 101 1000

5 05jan2012 101 1000

6 06jan2012 101 1000

7 07jan2012 101 1000

8 08jan2012 101 1000

9 09jan2012 101 1000

10 10jan2012 101 1000

11 01jan2012 102 5000

12 01jan2012 103 100

;

run;

proc sql;

select distinct account_id from

(select

    t1.account_id

    ,t1.tx_date   

from

    work.have t1

    left outer join work.have t2

        on t1.account_id=t2.account_id

            and t1.tx_date <= t2.tx_date

            and intnx('year',t1.tx_date,1)>=t2.tx_date

group by t1.account_id, t1.tx_date

having sum(t2.amt)>=5000) ;

quit;

View solution in original post

3 REPLIES 3
ballardw
Super User

If the dates are SAS date values this may not be too difficult.

Proc format;

value tdate

/* this is where the work goes on by specifying date ranges of interest*/

'01APR2000'd   -    '31MAR2001'd

'01APR2001'd   -    '31MAR2002'd

'01APR2002'd   -    '31MAR2003'd

'01APR2003'd   -    '31MAR2004'd

'01APR2004'd   -    '31MAR2005'd

'01APR2005'd   -    '31MAR2006'd

'01APR2006'd   -    '31MAR2007'd  /* continue as needed to span your time range*/

;

run;

I would probably build a variety of permanent formats maybe 12 such as tdateJan tdateFeb tdateMar tdateApr (above)

and then change the format in the proc summary below.

proc summary data=yourdata nway;

    

     class AccountId TransActionDate;

     format TransactionDate Tdate.;

     var TransactionValue;

     output out=TransactionSum (where=(TransactionSum ge 5000)) sum=TransactionSum;

run

art297
Opal | Level 21

Mike,

I couldn't write it without a sample dataset.  And, are you looking for individual transactions during the twelve month periods or a sum of the transactions within a month, or something else altogether?

If it is the sum of transactions, I would use proc summary to create a file that has the monthly sums for each id for each month of the ten-year period i.e., include a format that converted dates to monyy7..

Then, I'd use a datastep that only kept the ID number with (1) a 120 cell array to capture all of the months; (2) a loop that went through each of the possible 12 month periods and, in that loop, build a 12 cell array to capture the 12 months and use the sum function to determine if the value was ge 5000.

If that criterion is reached, output the ID number, exit the loop and begin the process over with the next id's information..

DBailey
Lapis Lazuli | Level 10

data have;

input tx_id tx_date account_id amt;

informat tx_date date9.;

format tx_date date9.;

cards;

1 01jan2012 101 1000

2 02jan2012 101 1000

3 03jan2012 101 1000

4 04jan2012 101 1000

5 05jan2012 101 1000

6 06jan2012 101 1000

7 07jan2012 101 1000

8 08jan2012 101 1000

9 09jan2012 101 1000

10 10jan2012 101 1000

11 01jan2012 102 5000

12 01jan2012 103 100

;

run;

proc sql;

select distinct account_id from

(select

    t1.account_id

    ,t1.tx_date   

from

    work.have t1

    left outer join work.have t2

        on t1.account_id=t2.account_id

            and t1.tx_date <= t2.tx_date

            and intnx('year',t1.tx_date,1)>=t2.tx_date

group by t1.account_id, t1.tx_date

having sum(t2.amt)>=5000) ;

quit;

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 1586 views
  • 6 likes
  • 4 in conversation