
Anti-Fraud In Graph Data

Started ‎03-30-2021 by
Modified ‎10-20-2022 by
Views 1,878
Team Name The Dark Knight Of DC
Track Banking & Insurance
Use Case Anti-fraud applications in graph data
Technology Machine Learning; Graph Algorithm; Graph Embedding; Neural Network, etc.
Region APAC
Team lead @xupw 
Team members @Liujql  @hailunwang  

 

Here are more details about the data we used:

The original dataset is extracted from AMiner (https://www.aminer.cn/), using articles as nodes, the word embeddings of their titles as node features, and the citation relationships as the adjacency structure. The dataset contains 659,574 nodes and 2,878,577 links, and each node has a 100-dimensional feature vector. The labels of the first 609,574 nodes (indexed 0 to 609,573) are released for training, while the labels of the remaining 50,000 nodes are held out for evaluation.
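To make the split concrete, here is a minimal sketch of how the node indices and the citation links could be organised. It uses only the counts given above; the toy edge list, the helper name `build_adjacency`, and the choice to treat citations as undirected are assumptions made for illustration, not part of the original dataset description.

```python
from collections import defaultdict

NUM_NODES = 659_574               # total nodes reported for the dataset
NUM_TRAIN = 609_574               # nodes whose labels are released
NUM_EVAL = NUM_NODES - NUM_TRAIN  # nodes held out for evaluation

# With zero-based indexing, labelled nodes are 0 .. NUM_TRAIN - 1
# and the evaluation nodes are the remaining indices.
train_idx = range(0, NUM_TRAIN)
eval_idx = range(NUM_TRAIN, NUM_NODES)


def build_adjacency(edges):
    """Turn (src, dst) citation pairs into an undirected adjacency list.

    Treating citations as undirected is an assumption of this sketch.
    """
    adj = defaultdict(set)
    for src, dst in edges:
        adj[src].add(dst)
        adj[dst].add(src)
    return adj


# Hypothetical toy edges standing in for the 2,878,577 real links.
toy_edges = [(0, 1), (1, 2), (2, 0), (3, 1)]
toy_adj = build_adjacency(toy_edges)

print(len(train_idx), len(eval_idx))  # 609574 50000
print(sorted(toy_adj[1]))             # [0, 2, 3]
```

Using `range` objects keeps the index sets cheap to hold in memory even at this scale.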

 

We change the application background of these data. We assume that the data come from a bank: the nodes are treated as bank accounts, the personal information of the account holders (age, region, income, AUM, etc.) is used as node features, and the node labels are treated as customer labels. In this way we use public data to simulate a real financial scenario.

 

Each article in the AMiner dataset serves as a node, and the nodes fall into 18 categories, each representing the research field of an article. We build GCN and GAT models to classify these nodes. In this 18-class node classification task, GCN trains significantly more efficiently than GAT, and it reaches a classification accuracy of 43.76% after 500 training iterations. The relatively low accuracy may be because an article can belong to multiple research fields.
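For readers unfamiliar with GCN, the sketch below shows the propagation rule of a standard two-layer GCN (symmetric normalisation with self-loops) on a toy graph in plain Python. It is an illustration only, not the team's code: the 4-node graph, the 2-dimensional features, the 3 classes, and all weight values are made up, whereas the real task has 100-dimensional features and 18 classes, and the weights would be learned by training.

```python
import math


def relu(x):
    return x if x > 0 else 0.0


def softmax(row):
    m = max(row)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def gcn_layer(adj, features, weight, activation=None):
    """One GCN step: H' = act(D^-1/2 (A + I) D^-1/2 H W).

    `adj` maps a node index to its neighbours and must be symmetric.
    """
    n = len(features)
    # Add self-loops so every node keeps its own features.
    neigh = [set(adj.get(i, ())) | {i} for i in range(n)]
    deg = [len(s) for s in neigh]
    out_dim = len(weight[0])
    out = []
    for i in range(n):
        # Symmetrically normalised sum over the neighbourhood.
        agg = [0.0] * len(features[0])
        for j in neigh[i]:
            coef = 1.0 / math.sqrt(deg[i] * deg[j])
            for k, x in enumerate(features[j]):
                agg[k] += coef * x
        # Linear transform of the aggregated features.
        row = [sum(agg[k] * weight[k][c] for k in range(len(agg)))
               for c in range(out_dim)]
        if activation is not None:
            row = [activation(v) for v in row]
        out.append(row)
    return out


# Hypothetical toy graph and fixed weights (real weights are learned).
adj = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1]}
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
w_hidden = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
w_out = [[0.2, -0.1, 0.4], [0.7, 0.3, -0.6], [-0.3, 0.5, 0.1]]

hidden = gcn_layer(adj, features, w_hidden, activation=relu)
logits = gcn_layer(adj, hidden, w_out)
probs = [softmax(row) for row in logits]             # class probabilities
predicted = [row.index(max(row)) for row in probs]   # predicted class per node
```

Each row of `probs` is a probability distribution over the classes for one node; in practice a GNN library with sparse operations would replace these explicit loops.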

 

Liujql_0-1617852535257.png

 

In a real anti-fraud scenario, we can transfer this modeling method and process: take each customer as a node of the graph, label each node as a fraudulent customer or a normal customer according to the customer's fraud attributes, and classify each node with graph neural networks such as GCN and GAT. Because the anti-fraud task is a binary classification task, and because fraudulent and normal customers are distinct from each other, we expect the accuracy of the final anti-fraud model to improve considerably on the existing model results. In fact, graph neural networks are already being used in various areas of anti-fraud, including application anti-fraud, transaction anti-fraud, anti-money laundering, and financial risk control.
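As an illustration of the binary setting, the following sketch applies a single-head graph attention (GAT-style) layer to a toy customer graph and turns the two outputs into a fraud probability. Everything here is hypothetical: the graph, the two features per account, the weight matrix, and the attention vector are invented values, and a real model would learn them from labelled customers.

```python
import math


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def softmax(values):
    m = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(weight, vec):
    """Multiply a row vector by an (in_dim x out_dim) weight matrix."""
    return [sum(vec[k] * weight[k][c] for k in range(len(vec)))
            for c in range(len(weight[0]))]


def gat_layer(adj, features, weight, attn):
    """Single-head attention layer over each node's neighbours and itself.

    Score: e_ij = LeakyReLU(attn . [W h_i || W h_j]), softmax-normalised
    over j, then used to average the transformed neighbour features.
    """
    z = [matvec(weight, h) for h in features]
    out, alphas = [], []
    for i in range(len(features)):
        neigh = sorted(set(adj.get(i, ())) | {i})
        scores = [leaky_relu(sum(a * v for a, v in zip(attn, z[i] + z[j])))
                  for j in neigh]
        alpha = softmax(scores)
        out.append([sum(al * z[j][c] for al, j in zip(alpha, neigh))
                    for c in range(len(z[i]))])
        alphas.append(dict(zip(neigh, alpha)))
    return out, alphas


# Hypothetical customer graph: nodes are accounts, edges are relationships.
adj = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1]}
features = [[0.2, 0.9], [0.8, 0.1], [0.3, 0.7], [0.9, 0.2]]  # made-up, scaled
weight = [[1.0, -1.0], [-0.5, 0.5]]  # maps 2 features to 2 classes
attn = [0.3, -0.1, 0.2, 0.4]         # length is 2 * number of classes

logits, alphas = gat_layer(adj, features, weight, attn)
# Index 0 is "normal", index 1 is "fraud" (a labelling assumption).
fraud_prob = [softmax(row)[1] for row in logits]
```

The attention weights in `alphas` show how much each neighbour contributes to a node's prediction, which can be helpful when a flagged account needs to be explained.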

 

 

 

 

 

Comments

Can you share a video of what you did in the SAS system as part of the hackathon? Your videos above introduce the concept, but I am not able to see how this was realised in SAS or open-source models.

It is difficult to describe all the details in the presentation video. Here are the steps. 1) A broad range of machining sounds (120 pieces) was first collected from our workshop and online. 2) We segmented and labelled these machine sound data (500 each) into 4 categories of machining events (i.e. NormalCutting, HeavyCutting, Crash and ToolBreak). 3) The training dataset was generated by extracting 13 signal features (in CSV), which are the inputs of the AI model. 4) The AI model was developed and trained using SAS Model Studio. 5) The application then called the SAS API to identify the specific machining event.

Owing to the rapid development of graph data and graph representation, many new GNN techniques and applications have been proposed. We describe the main GCN and GAT methods and code in the presentation video; in fact, we use SAS modules in the code.

SAS_1.png, SAS_2.png

Love the creativity in your videos! 

Version history
Last update:
‎10-20-2022 12:15 PM
Updated by:
