BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Malewar
Fluorite | Level 6

Hi There,

I'm trying to calculate the default value from the available data. I have merged the data by using following code
 
Data Loan_1;
Set 36_Months 60 Months;
Run;
 
Then I have sorted the data using proc sort
After that i have created a variable named Loan_Status2 in which I have created to values 1st is Current and 2nd is default using following code
 
Data Loan_11;
Set Loan_1;
If loan_Status = 'Late (31-120 Days)' or Loan_Status = 'Default' then Loan_Status2 = 1;
Else If loan_Status = 'Current' or loan_Status = 'Late (16-30 Days)' or loan_Status = 'In Grace Period' or loan_Status = 'Charged Off' or loan_Status = 'Fully Paid'  then loan_Status2 = 0;
 
After this I'm trying to calculate the default % by using following code
 

Data Loan_111;
Set Loan_11;
DefaultLoan  = (Sum (1)/Sum(Loan_Status2));
Format DefaultLoan $Char20.;
Proc Print;
Run;
 
But getting wrong output
 
please do the needful.
 
Regards,
Roshan Malewar

 

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

First, let's assume that you have working code that accurately creates the data set LOAN_11.  There are several areas where the syntax is questionable, so it may have been copied incorrectly in the posted code.

 

The SUM function does not summarize across observations.  It sums up within a single observation only.  To get the rate that you are asking for, I would guess you are looking for:

 

proc means data=loan_11 mean;

var loan_status2;

run;

 

The mean value for LOAN_STATUS2 will be the percent of loans that are in default.

 

As a general rule, think about using a procedure, not a DATA step, to summarize across observations.

 

 

View solution in original post

3 REPLIES 3
Astounding
PROC Star

First, let's assume that you have working code that accurately creates the data set LOAN_11.  There are several areas where the syntax is questionable, so it may have been copied incorrectly in the posted code.

 

The SUM function does not summarize across observations.  It sums up within a single observation only.  To get the rate that you are asking for, I would guess you are looking for:

 

proc means data=loan_11 mean;

var loan_status2;

run;

 

The mean value for LOAN_STATUS2 will be the percent of loans that are in default.

 

As a general rule, think about using a procedure, not a DATA step, to summarize across observations.

 

 

Malewar
Fluorite | Level 6

Thank you very much Super User. It worked fine.  

ballardw
Super User

If you have listed all of the values for loan_status in your code you might find this interesting.

proc format;
value $default
 'Late (31-120 Days)', 'Default'  ='In Default'
 'Current', 'Late (16-30 Days)', 'In Grace Period', 
 'Charged Off', 'Fully Paid'  = 'Not Default'
 ;
 run;

proc freq data=Loan_1;
   tables loan_status;
   format loan_status $default.;
run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 674 views
  • 1 like
  • 3 in conversation