help validating duplicates?

Occasional Contributor
Posts: 11


Dear folks,

I am working on a project on enhanced diligence for credit card users at a financial services company. I am trying to test for duplicates in credit card usage across different locations in Europe and to identify a pattern. So far my team has only managed to run simple frequency tests (PROC FREQ) in our reports for duplicate testing. Are there other ways you could suggest to test this better, by relating variables on identity, location, and transaction usage pattern?

An example: checking whether someone's credit card has been used at different locations within a very short time interval.

My objective is not to build a statistical model right away; it is merely to carry out enhanced checks to understand the scenario.

I have tried univariate and bivariate exploratory analysis; however, I have not been able to translate any of these descriptive statistics into a meaningful understanding.

Your suggestions will be most appreciated. Thanks. Merry Christmas.



Frequent Contributor
Posts: 97

Re: help validating duplicates?

Posted in reply to Nav_Sweeney


You can try PROC SQL or PROC SORT.

I guess PROC SQL is best:

proc sql;
    /* one row per Ident/country pair, with the number of records in that group */
    create view _have as
        select Ident, country, count(*) as n_records
        from have
        group by Ident, country;
quit;
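The PROC SORT route mentioned above could look roughly like the sketch below. It assumes the same input table `have` with the key columns `Ident` and `country`; the output names `have_unique` and `have_dups` are made up for illustration.

```sas
/* Sketch only: split "have" into first occurrences and repeats per key. */
/* have_unique and have_dups are hypothetical output names.              */
proc sort data=have out=have_unique dupout=have_dups nodupkey;
    by Ident country;
run;

/* have_unique: the first row found for each Ident/country pair   */
/* have_dups:   every further row that repeats an existing pair   */
```

Note that `have_dups` holds only the second and later rows of each key, not the first one, so a key with a single repeat appears there once.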




Respected Advisor
Posts: 3,167

Re: help validating duplicates?

Posted in reply to Nav_Sweeney


IMHO, your task is more about rule setting than programming. Once you have your rules set, the programming that follows is just mechanical, and we can definitely help you with that. First of all, it all depends on your purpose and what you want to achieve: 1) simply identify spending patterns, or 2) flag possible fraud.

For instance, if your objective is to flag possible fraud, to start, I would recommend the following:

1. Differentiate Internet and local transactions. For Internet orders, flag those where the shipping address differs from the billing address, or where the address is just a mailbox.

2. For local transactions, you will need an additional map database to determine the distance between two transactions and the reasonable travel time by flight, train, or car. You would need many rules here. For example, if the distance is too large to drive, or the majority (you need to define that) of travel between two places is done a certain way, then an abnormality could be identified by assigning different weights.

3. You probably also need to factor in the time of year. Life-changing events like graduations and weddings should also be considered. And comparing against historical records would help very much.
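As a rough illustration of the distance and travel-time rule in item 2, here is a minimal sketch that compares each transaction with the previous one on the same card and flags implausible implied speeds. Everything about the input is an assumption: a table `card_txn` with `card_id`, `txn_dt` (a SAS datetime), and `lat`/`lon` in degrees. The 900 km/h cut-off is an arbitrary placeholder, not a recommended rule.

```sas
/* Sketch only: table, column names and threshold are hypothetical. */
proc sort data=card_txn out=card_txn_sorted;
    by card_id txn_dt;
run;

data speed_flags;
    set card_txn_sorted;
    by card_id;

    /* LAG keeps its own queue, so call it on every row, not inside an IF */
    prev_dt  = lag(txn_dt);
    prev_lat = lag(lat);
    prev_lon = lag(lon);

    flag_impossible = 0;

    /* Compare only with the previous transaction of the same card */
    if not first.card_id then do;
        dist_km = geodist(prev_lat, prev_lon, lat, lon, 'K'); /* great-circle distance in km */
        hours   = (txn_dt - prev_dt) / 3600;                  /* datetimes are in seconds    */
        if hours > 0 then speed_kmh = dist_km / hours;
        /* Flag: two places at the same timestamp, or faster than the placeholder limit */
        if (hours = 0 and dist_km > 0) or speed_kmh > 900 then flag_impossible = 1;
    end;

    format prev_dt datetime20.;
run;
```

A straight-line speed is only a first filter. The weighting by likely mode of travel described above would still have to be layered on top as separate rules.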

And there are many, many other items I have failed to include. So what you need is to do in-depth research on setting up your rules; there are no unusual events if there are no rules.

Just my 2 cents,


Frequent Contributor
Posts: 86

Re: help validating duplicates?

Posted in reply to Nav_Sweeney

Basically, these will be data sanity checks before they are further processed and used as input in any kind of model.

The starting point would be to check whether the same user account has been opened more than once in different European countries, by matching on multiple parameters and flagging the accounts.
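A minimal sketch of that first check, assuming a table `accounts` with one row per account and the identity columns `first_name`, `last_name`, and `birth_date` plus `country`. All of these names are hypothetical; the real matching parameters depend on what your data holds.

```sas
/* Sketch only: flag identities that hold accounts in more than one country. */
proc sql;
    create table multi_country as
        select a.*, d.n_countries
        from accounts as a
             inner join
             /* identities seen in two or more distinct countries */
             (select first_name, last_name, birth_date,
                     count(distinct country) as n_countries
              from accounts
              group by first_name, last_name, birth_date
              having count(distinct country) > 1) as d
             on  a.first_name = d.first_name
             and a.last_name  = d.last_name
             and a.birth_date = d.birth_date
        order by a.last_name, a.first_name, a.birth_date, a.country;
quit;
```

This matches on exact values only. Differences in case, spacing, or spelling would need to be normalized first (for example with UPCASE and STRIP) before the grouping is meaningful.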

Then you can look at their spending patterns and the payments posted after credit card usage for any suspicion of fraudulent use.

You can also include time variables and check the card usage of suspicious accounts.

For the transaction usage pattern, you can first summarize the account balances over large periods, say quarter 1, quarter 2, and so on, and then drill down into a quarter based on what you find.
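The quarterly roll-up could be sketched as below. It assumes a table `acct_txn` with `account_id`, `txn_date` (a SAS date), and `amount`; these names are illustrative, and summing transaction amounts is a stand-in for whatever balance measure you actually track.

```sas
/* Sketch only: per-account totals by calendar quarter. */
proc sql;
    create table spend_by_quarter as
        select account_id,
               year(txn_date) as txn_year,
               qtr(txn_date)  as txn_qtr,      /* 1 to 4 */
               count(*)       as n_txn,
               sum(amount)    as total_amount
        from acct_txn
        group by account_id, txn_year, txn_qtr
        order by account_id, txn_year, txn_qtr;
quit;
```

Once a quarter stands out, the same query with `month(txn_date)` in place of `qtr(txn_date)` and a filter on that quarter gives the next level of detail.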

Once you set your logic, these can be easily applied using SAS.

Respected Advisor
Posts: 4,741

Re: help validating duplicates?

Posted in reply to Nav_Sweeney

Going forward you might want to investigate what the SAS Financial Crimes solutions could do for you. I was involved in an implementation of the Anti-Money Laundering solution and it added quite a bit of value for the customer.

One of the benefits is that the solutions already come with a set of "out-of-the-box" rules (which can be amended and extended). So it's not only about an application but also about the knowledge and processes which come as part of the solutions.

Anti-Money Laundering | SAS
