BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Malewar
Fluorite | Level 6

Hi There,

I'm trying to calculate the default value from the available data. I have merged the data by using following code
 
Data Loan_1;
Set 36_Months 60 Months;
Run;
 
Then I have sorted the data using proc sort
After that i have created a variable named Loan_Status2 in which I have created to values 1st is Current and 2nd is default using following code
 
Data Loan_11;
Set Loan_1;
If loan_Status = 'Late (31-120 Days)' or Loan_Status = 'Default' then Loan_Status2 = 1;
Else If loan_Status = 'Current' or loan_Status = 'Late (16-30 Days)' or loan_Status = 'In Grace Period' or loan_Status = 'Charged Off' or loan_Status = 'Fully Paid'  then loan_Status2 = 0;
 
After this I'm trying to calculate the default % by using following code
 

Data Loan_111;
Set Loan_11;
DefaultLoan  = (Sum (1)/Sum(Loan_Status2));
Format DefaultLoan $Char20.;
Proc Print;
Run;
 
But getting wrong output
 
please do the needful.
 
Regards,
Roshan Malewar

 

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

First, let's assume that you have working code that accurately creates the data set LOAN_11.  There are several areas where the syntax is questionable, so it may have been copied incorrectly in the posted code.

 

The SUM function does not summarize across observations.  It sums up within a single observation only.  To get the rate that you are asking for, I would guess you are looking for:

 

proc means data=loan_11 mean;

var loan_status2;

run;

 

The mean value for LOAN_STATUS2 will be the percent of loans that are in default.

 

As a general rule, think about using a procedure, not a DATA step, to summarize across observations.

 

 

View solution in original post

3 REPLIES 3
Astounding
PROC Star

First, let's assume that you have working code that accurately creates the data set LOAN_11.  There are several areas where the syntax is questionable, so it may have been copied incorrectly in the posted code.

 

The SUM function does not summarize across observations.  It sums up within a single observation only.  To get the rate that you are asking for, I would guess you are looking for:

 

proc means data=loan_11 mean;

var loan_status2;

run;

 

The mean value for LOAN_STATUS2 will be the percent of loans that are in default.

 

As a general rule, think about using a procedure, not a DATA step, to summarize across observations.

 

 

Malewar
Fluorite | Level 6

Thank you very much Super User. It worked fine.  

ballardw
Super User

If you have listed all of the values for loan_status in your code you might find this interesting.

proc format;
value $default
 'Late (31-120 Days)', 'Default'  ='In Default'
 'Current', 'Late (16-30 Days)', 'In Grace Period', 
 'Charged Off', 'Fully Paid'  = 'Not Default'
 ;
 run;

proc freq data=Loan_1;
   tables loan_status;
   format loan_status $default.;
run;

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 1347 views
  • 1 like
  • 3 in conversation