DATA Step, Macro, Functions and more

Credit card transaction data - transaction velocity

Reply
Occasional Contributor
Posts: 6

Credit card transaction data - transaction velocity

Hi all,

 

I have a large amount of transaction data and I'm wanting to pull transactions where a count of previous transactions have taken place prior to a given transaction in a specified amount of time.  For example, I might want to see all transactions where there were two transactions that took place at the same merchant in the previous hour on the same card.  

 

I have account numbers and the datetime of the authorizations to work with along with merchant name, merchant category, and a bunch of variables to specify the activity I'm looking for.  I have other systems I can use to pull such activity but it's slow and clunky.  I've been playing around with different code but can't seem to come up with something that works.  

 

Any suggestions on how I can accomplish this?  No need to provide code examples if it's easier to just explain in words how I can do this.

 

If further explanation of what I'm trying to accomplish is needed just let me know and I'll be happy to provide.

 

Thanks!

Super User
Posts: 5,256

Re: Credit card transaction data - transaction velocity

Sounds like you have two issues:

  1. How to code the logic. Please provide what you tried so far. Extract a reasonable amount of transactions to you local SAS environment, so speed up program development. Or create your own sample data.
  2. The speed to interact with the source system is too slow. This is probably more of an architectural issue. If you have the resources, you should extract all data in bulk regularly, in a data warehouse like initiative, then you should be able to apply your logic on high speed SAS data sets.
Data never sleeps
Occasional Contributor
Posts: 6

Re: Credit card transaction data - transaction velocity

Coding the logic is the issue.

 

I've tried using the lag function and intck.  I've thought about using SQL and joining the table back on itself and doing intck but this doesn't seem efficient at all considering I'm working with tens of thousands of rows of data.  

Super User
Posts: 5,256

Re: Credit card transaction data - transaction velocity

Lag or Retain should do the trick. What didn't work?
Join does not sound like an ideal solution.
Data never sleeps
Occasional Contributor
Posts: 6

Re: Credit card transaction data - transaction velocity

I can get the time between transactions but run into trouble when wanting to actually pull the transactions where there were a set number of previous transactions in a given time.

Super User
Posts: 10,500

Re: Credit card transaction data - transaction velocity

When you say "a given time" to you mean an explicit time such as 1:00:00 to 1:59:59 or an interval such as 60 minutes?

The later is going to have issues because you could well have overlapping intervals.

 

See this example code:

data example;
   informat acct $5. transdt datetime.;
   format transdt datetime.;
   input acct transdt;
datalines;
12345 01JAN2016:10:05:00
12345 01JAN2016:10:35:00
12345 01JAN2016:11:00:00
12345 01JAN2016:11:25:00
12345 01JAN2016:11:55:00
;
RUN;

data junk;
   set example;
   int1 = intck('minute',lag1(transdt),transdt);
   int2 = intck('minute',lag2(transdt),transdt);
   int3 = intck('minute',lag3(transdt),transdt);
   int4 = intck('minute',lag4(transdt),transdt);
run;

Look at the intervals (the int variables). The 4th record is within an hour of the 2nd record and the 3rd record is within an hour of the 1st. The 4th is within an hour of 3rd but not 1st. So which interval does the 3rd record fit? And the longer your interval the more likely you have some overlapping intervals. If you want a single 'interval' per record you will have to pick some rules. Or calculate a bunch of variables (array anyone) of Interval_to_previous_transaction and which_previous_transaction and possibly what ever you are doing with the values of the transactions).

 

 

Or possibly this might be an IML or ETS but not sure what would fit.

Occasional Contributor
Posts: 6

Re: Credit card transaction data - transaction velocity

I'm referring to an interval such as 60 minutes.  Not an explicit time.  

 

Thanks,

 

Josh

Super User
Posts: 10,500

Re: Credit card transaction data - transaction velocity

And your rule(s) for handling overlapping intervals are what?

Super User
Posts: 3,105

Re: Credit card transaction data - transaction velocity

I've been involved with similar exercises like this in the past and I suspect you may be wanting to analyse credit card transactions for suspicious or fraudulent behaviour.

 

The best approach with this might be to just preprocess the transactions and total up the number of transactions by merchant and card number for each quarter hour slot (24 times 4 per day). Then go through the data again (with the quarter hour data joined on to it), take the quarter hour slot the transaction belongs to plus the three preceding ones to work out how many transactions there have been in the last hour. 

 

Occasional Contributor
Posts: 6

Re: Credit card transaction data - transaction velocity

Having trouble wrapping my head around this idea.  

 

We have a couple people in other areas here I'm going to reach out to see if they can walk me through what I'm trying to do.

 

Thanks for the suggestions.

 

My SAS skills are pretty limited at this point.  I've mostly relied on SQL in the past.

Ask a Question
Discussion stats
  • 9 replies
  • 384 views
  • 0 likes
  • 4 in conversation