Statistical Analysis on Banking data

05-13-2015 12:37 PM

What type of analysis can be done with the following variables? E.g. chi-square test, hypothesis testing.

Name, address, city, state, zip, balance, product, account id, open date

Basically I need to give insights to the bank from their given data.

I tried a chi-square test between balance and product, but I was not successful.

Could someone give a skeleton for any analysis you propose?

Accepted Solutions

Solution

Posted in reply to Babloo

05-16-2015 04:29 AM

Customer profitability requires account transaction data such as fees and interest, as well as the bank's cost of lending, and is very complex to work out. If you are only starting out I would start with simpler questions.

To measure the risk of customers going into default and not repaying their loans, you need to track customer account behaviour over years of data, looking at repayment history and loan balances. If you don't have historical data going back several years then you cannot do this type of analysis. You then need to identify the accounts that weren't paid back and look at the account behaviour prior to this. You are talking many weeks of work to come up with any meaningful results. If you have never done this type of work before then, again, starting with simpler questions might be better.

All Replies

Posted in reply to Babloo

05-13-2015 12:44 PM

Have you started off with the basics - summaries?

Average account balance

Distribution of account balance

# of accounts by state/zip/city

# of unique individuals

# of accounts/individual

% of accounts/individuals vs population data for state/city/zip - could be used for deciding where business can expand.

% of account balance by state/city/zip

Age of customers by state/city/zip

For many of the # tables above you could probably run a chi-square test
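A few of the summaries above can be sketched in plain Python. This is only a minimal illustration: the sample rows are made up, the field names simply mirror the variables listed in the question, and treating the name as the customer identifier is an assumption (real data would likely need name plus address or a customer id).

```python
from collections import Counter, defaultdict
from statistics import mean, median

# Hypothetical sample rows; field names mirror the variables in the question.
accounts = [
    {"name": "A. Smith", "state": "NY", "product": "Savings", "balance": 1200.0},
    {"name": "A. Smith", "state": "NY", "product": "Loan", "balance": 15000.0},
    {"name": "B. Jones", "state": "CA", "product": "Savings", "balance": 300.0},
    {"name": "C. Lee", "state": "CA", "product": "Checking", "balance": 0.0},
    {"name": "D. Patel", "state": "TX", "product": "Savings", "balance": 52000.0},
]

# Average and median account balance (the median hints at how skewed balances are).
balances = [row["balance"] for row in accounts]
average_balance = mean(balances)
median_balance = median(balances)

# Number of accounts by state, and number of accounts per individual.
# Assumption: "name" alone identifies an individual.
accounts_by_state = Counter(row["state"] for row in accounts)
accounts_per_individual = Counter(row["name"] for row in accounts)
unique_individuals = len(accounts_per_individual)

# Percentage of the total balance held in each state.
balance_by_state = defaultdict(float)
for row in accounts:
    balance_by_state[row["state"]] += row["balance"]
total_balance = sum(balances)
balance_share_by_state = {
    state: 100.0 * amount / total_balance
    for state, amount in balance_by_state.items()
}

print(average_balance, median_balance, unique_individuals)
print(dict(accounts_by_state))
print(balance_share_by_state)
```

The same counts by city or zip only need a different key in the `Counter` calls.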

Posted in reply to Babloo

05-13-2015 01:36 PM

If you want to run something like chi-square for balances you could create ranges of values that mean something to the bank, such as 0, 1 to 10,000, 10,001 to 25,000, more than 25,000 or similar. Ideally the boundaries between groups would mark points where the bank treats the customer differently in some manner, such as rate changes for loans/deposits, additional offers, reminders, tax effects or similar.

One way to do that would be to create custom format(s) and apply those where you are doing the chi-square. The formatted values will be used to create groups based on the range.
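As a rough sketch of the idea (in plain Python rather than a SAS custom format, which is a swap on my part), the balance can be mapped to a band label and the band cross-tabulated against product. The cut points, product names, and records below are all made up for illustration, and the helper names `balance_band` and `chi_square` are hypothetical.

```python
from collections import Counter

def balance_band(balance):
    """Map a balance to a band label; the cut points are illustrative only."""
    if balance <= 0:
        return "0 or less"
    if balance <= 10_000:
        return "1 to 10,000"
    if balance <= 25_000:
        return "10,001 to 25,000"
    return "more than 25,000"

def chi_square(pairs):
    """Pearson chi-square statistic and degrees of freedom for (row, column) pairs."""
    pairs = list(pairs)
    observed = Counter(pairs)
    row_totals = Counter(row for row, _ in pairs)
    col_totals = Counter(col for _, col in pairs)
    n = len(pairs)
    stat = 0.0
    for row in row_totals:
        for col in col_totals:
            expected = row_totals[row] * col_totals[col] / n
            stat += (observed[(row, col)] - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof

# Hypothetical (product, balance) records.
records = [
    ("Savings", 500), ("Savings", 8000), ("Savings", 12000), ("Savings", 300),
    ("Loan", 15000), ("Loan", 30000), ("Loan", 22000), ("Loan", 40000),
]

stat, dof = chi_square((product, balance_band(bal)) for product, bal in records)
print(f"chi-square = {stat:.3f} with {dof} degrees of freedom")
```

The statistic still has to be compared against a chi-square distribution to get a p-value, and with expected cell counts as small as in this toy table the test would not be reliable; real account data would have far more rows per cell.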

It may be that each product could use different ranges. Also, I would expect different behaviors between deposit products, loans and brokerage. So I would expect the simple "is there a difference in the distribution of values between products" answer that a chi-square provides to be not greatly informative.

Total of balance across accounts for individuals might indicate "valued customer" or similar status.
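A minimal sketch of that roll-up, assuming the name identifies the individual and using a made-up threshold (`VALUED_THRESHOLD` is hypothetical; the bank would set its own cut-off, and would probably want to keep loans and deposits apart before summing):

```python
from collections import defaultdict

VALUED_THRESHOLD = 50_000  # hypothetical cut-off for "valued customer"

# Hypothetical (name, balance) pairs, one per account.
accounts = [
    ("A. Smith", 1200.0),
    ("A. Smith", 60000.0),
    ("B. Jones", 300.0),
    ("C. Lee", 49000.0),
    ("C. Lee", 1000.0),
]

# Sum the balance across all accounts held by the same individual.
total_by_individual = defaultdict(float)
for name, balance in accounts:
    total_by_individual[name] += balance

# Individuals whose combined balance reaches the threshold.
valued_customers = sorted(
    name for name, total in total_by_individual.items()
    if total >= VALUED_THRESHOLD
)
print(valued_customers)
```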

If the data covered different points in time rather than a single snapshot, then changes over time would likely be informative.

Posted in reply to Babloo

05-13-2015 03:50 PM

What is the focus of your analysis? Is it customer behaviour or understanding customers? Or is it something else? Without giving us more guidance it is difficult to point you in the right direction. Reeza's list is a good starting point.

I actually work in a bank and there are hundreds if not thousands of ways you can look at bank data. Usually there is some end game in mind: it could be marketing, like upselling customers; it could be financial, finding out who your profitable customers are; or it could be risk-related, finding out who is most likely to not repay their loans.

Posted in reply to SASKiwi

05-14-2015 03:32 AM

I need to understand customer behavior, as you said. I wish to find out who the profitable customers are and, on the risk side, who is most likely to not repay their loans.

Chi-square analysis is just an example that I know. I am open to any analysis you propose that can be done via EG.

Posted in reply to Babloo

05-14-2015 09:28 AM

Then you should take a look at

Logistic Regression - proc logistic

Poisson Regression - proc genmod (can be applied to multi-dimensional contingency tables, whereas chi-square is usually limited to two dimensions)

Posted in reply to Ksharp

05-14-2015 09:51 AM

Could you suggest the dependent and independent variables for these regression analyses? Any additional statements/options are most welcome.

Posted in reply to Babloo

05-14-2015 10:07 AM

It is a long story. Actually I am not an expert in it, although my major is Economics and Finance.

Make a binary variable:

0 - fraud

1 - not fraud

Use proc logistic to find which variables have the most influence on fraud. Of course, it also gives a predicted probability score.
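To make the idea concrete, here is a very small logistic regression fitted by plain gradient descent in Python. This is a teaching sketch, not what proc logistic does internally, and everything in it is hypothetical: the single predictor `x`, the flag values, the learning rate, and the helper names `fit_logistic` and `predict`. The flag follows the coding above (0 = fraud, 1 = not fraud), so the model estimates the probability of "not fraud". Note that the data described in the question has no such flag.

```python
import math

def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, learning_rate=0.1, epochs=5000):
    """Fit P(y = 1) = sigmoid(w * x + b) by batch gradient descent on the log-loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(w * x + b) - y
            grad_w += error * x
            grad_b += error
        w -= learning_rate * grad_w / n
        b -= learning_rate * grad_b / n
    return w, b

def predict(w, b, x):
    """Predicted probability that y = 1 for a given x."""
    return sigmoid(w * x + b)

# Hypothetical predictor (for example, a scaled account measure) and flag
# coded as in the post: 0 = fraud, 1 = not fraud.
xs = [0.2, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
ys = [0, 0, 0, 1, 0, 1, 1, 1]

w, b = fit_logistic(xs, ys)
print(f"w = {w:.3f}, b = {b:.3f}")
print(f"P(not fraud | x = 3.5) = {predict(w, b, 3.5):.3f}")
```

A positive `w` here means larger values of `x` go with a higher predicted probability of "not fraud"; with several predictors, comparing the fitted effects is what tells you which variables matter most.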

And Poisson rate regression can produce a fraud rate score for each type of customer. Here is a paper.

24188 - Modeling rates and estimating rates and rate ratios (with confidence intervals)

Xia Keshan

Posted in reply to Babloo

05-14-2015 11:29 AM

Based on your description of your data you don't seem to have variables that would reflect profitable customers or risk related customers.

Posted in reply to Reeza

05-14-2015 01:44 PM

Could you tell me what other analyses can be performed on my data apart from the chi-square test?

Posted in reply to Babloo

05-14-2015 02:05 PM

Any analysis pertinent to the question. You're going at it backwards - come up with questions and then figure out how to analyze the data.
