BookmarkSubscribeRSS Feed
UshaLatha
Obsidian | Level 7

Hi All,

 

I need some help in analysis of residential mortgage data.

I have some fields like FICO score , LTV , Default flag in the data. 

FICOLTVDEFAULT_FLAG
551501
752751
713831
400900

 

I need a matrix which can give the joint distribution of FICO and LTV for the default rate. For eg, I have 1000 loans, of which 239 have defaulted (i.e default_flag=1) , which have LTV<70 and FICO<710, so the default rate should be 239/1000=0.239.. Similarly distribution for all the bucket ranges of LTV and FICO score is needed.

My desired distribution table should be like this:

  LTV
  Low  (<70)Medium (70,80)High(80,85)Very high(>85)
FICOLow (<710)0.2394.955.539.97
 Medium(710,750)1.53.424.25.2
 High(750,755)0.233.212.313.2
 Very high >7550.340.260.781.34

 

The code I have used is as follows:

 

proc tabulate data=test;
format LTV ltv_fmt. FICO fico_fmt. ;
class FICO LTV;
var DEFAULT_FLAG;
table FICO, LTV*DEFAULT_FLAG=" "*(N="count" PCTN="%");
run;

This code is giving the count of rows (irrespective of value of default_flag value) which fall into that particular ranges of LTV and FICO. But I need count of rows where DEFAULT_FLAG=1.

Please help me with this.

 

Thanks

4 REPLIES 4
HB
Barite | Level 11 HB
Barite | Level 11

Can you subset the data before the Proc Tabulate?

 

data want;
	set have;
	if default_flag = 1;
run;
proc tabulate data=want

 

 

UshaLatha
Obsidian | Level 7

Hi,

 

I need the counts to calculate the % of data with default_flag=1. 

So sub-setting of data would not help.

ballardw
Super User

@UshaLatha wrote:

Hi,

 

I need the counts to calculate the % of data with default_flag=1. 

So sub-setting of data would not help.


 

Sum of Default_flag will give count of values =1 if default_flag is only coded 1/0.

 

If you want (count of default_flag = 1 for intersection of formatted LFV and FICO / total number of records in data set) as a percentage you will need that to be calculated outside of proc tabulate.

If you want (count of default_flag = 1 for intersection of formatted LFV and FICO / total number of default_flag = 1 in data set)  as a percentage then you would want to make default flag a class variable with pctn but that would be also likely violate some version OR SUBSET the data to default_flag=1 with a where statement and N and PCTN for the other class variables on the table statement.

 

 

You might have to more completely describe you numerator and denominators for the percentage if it is something else you want for a percentage.

 

Or provide a small data set and show us what the result for that data set should be.

ballardw
Super User

Assuming your formats work as intended and you want a table only related to the default_flag=1 then I would start with:

proc tabulate data=test;
   where default_flag=1;
   format LTV ltv_fmt. FICO fico_fmt. ;
   class FICO LTV;

   table FICO ,
         LTV*(N="count" PCTN="%");
run;

or possibly to get one table for each level of default_flag

 

proc tabulate data=test;
   class  default_flag;
   format LTV ltv_fmt. FICO fico_fmt. ;
   class FICO LTV;

   table default_flag,
         FICO ,
         LTV*(N="count" PCTN="%");
run;

BTW formats that you cannot tell what the actual end points are such as

'Medium(710,750)' and 'High(750,755)' are very problematic. Which category does a score of exactly 750 fit into? Not obvious from the displayed text.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 4 replies
  • 1856 views
  • 2 likes
  • 3 in conversation