Help turn code into Data Step

Super Contributor
Posts: 418

Hello everyone. I have a piece of code that I wrote using a macro loop, and it is very inefficient: it takes roughly 0.5-0.6 seconds to run, and I need to run it at least 10,000 times. Please see the code below.

Can anyone take a look at the code and suggest a more efficient replacement? I am assuming a data step would be better, but I am having problems with lagging variables because of the logic.

The main logic is:

1) Each loan has a 3% chance to default each month, IF it is not already defaulted.

2) If the loan is already defaulted, it has a 99% chance to REMAIN DEFAULTED and a 1% chance to "CURE" itself back to current.

I can't think of a way to do this in a data step, because you need the previous month's default status to determine the odds of a loan being defaulted in the current month, yet that previous status itself depends on whether the loan was already defaulted the month before, so it seems like circular logic.

Any help would be appreciated. I need to get the entire run time under 0.1 seconds if at all possible. I am stuck on where to go next.

If anyone has any questions about the code, please let me know!

data SummaryData;
   infile datalines delimiter=',';
   input NumberOfloans  pmtamount Category;
   length pmtamount 4.;
   datalines;                     
5000,1200,1
10000,1500,2
;

data montecarlomaybe(drop=i p Category  NumberOfloans);
   set SummaryData;
   length loannumber $9.;
   do i=1 to NumberofLoans;
      do p=1 to 12;
         LoanNumber=strip(put(category,6.))||strip(put(i,6.));
         month=p;
         output;
      end;
   end;
run;

%macro LoopVar;
data starttime;
   starttime=datetime();
run;

%do i=1 %to 12;

   %let Ione=%eval(&i-1);

   %if &i=1 %then %do;
      data montecarlomaybe&i.(drop=newvar);
         set montecarlomaybe;
         length month default mindefaultmonth monthssincedefault 3.;
         where month=&i.;
         newvar=ranuni(0);
         if newvar<.03 then do;
            default=1;
            MinDefaultMonth=1;
         end;
         MonthsSinceDefault=.;
      run;
   %end;
   %else %do;

      data montecarlomaybe&i.(drop=Curedrand defaultrand);
         set montecarlomaybe&Ione.;
         length month default mindefaultmonth monthssincedefault 3.;
         Month=&i.;
         monthssincedefault=Month-MinDefaultMonth;
         Curedrand=ranuni(0);
         defaultrand=ranuni(0);
         if default=1 and curedrand < .01 then do;
            default=.;
            MinDefaultMonth=.;
            monthssincedefault=.;
         end;
         else if default=1 and curedrand > .01  then do;
            Default=1;
            MinDefaultMonth=MinDefaultMonth;
         end;
         else if missing(default)=1  and defaultrand<.03 then do;
            default=1;
            MinDefaultMonth=Month;
         end;
      run;
   %end;

   proc datasets;
      append base=Analysis
             data=montecarlomaybe&i.;
   run;

   proc sort data=analysis;
      by loannumber month;
   run;

   data starttime1;
      set starttime;
      endtime=datetime();
      totaltime=endtime-starttime;
   run;
%end;
%mend LoopVar;
%loopvar

Super User
Posts: 5,092

Re: Help turn code into Data Step

I think this becomes much easier and faster, and eliminates the macro language entirely, if you are willing to reconsider the structure of the data.  Instead of one observation per loan/month, create one observation per loan with a set of 12 variables:  in_default_1 - in_default_12.  Those should be easy to generate, using the random numbers you are already drawing.  Just assign a value for the first month, then use the random numbers to determine whether to change the value in each following month.  If you need to, you can calculate additional variables such as first default month, months since default, etc., once all 12 "in default" variables have been calculated.  Arrays make this type of processing fast and easy, but it begins with picturing what you would like as the new structure for the data.
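
A minimal sketch of that layout, assuming the SummaryData table from the question and 0/1 flags instead of 1/missing. The output name loans_wide and the helper variables i, m, u and prev are made up for illustration:

```sas
/* One observation per loan; all 12 monthly flags are filled in a single pass. */
data loans_wide(drop=i m u prev NumberOfLoans);
   set SummaryData;
   length LoanNumber $12;
   array in_default {12} in_default_1 - in_default_12;

   do i = 1 to NumberOfLoans;
      /* Same key construction as in the question */
      LoanNumber = strip(put(Category, 6.)) || strip(put(i, 6.));

      prev = 0;                            /* assumption: every loan starts current */
      do m = 1 to 12;
         u = ranuni(0);                    /* one uniform draw per loan-month */
         if prev = 1 then
            in_default{m} = (u >= 0.01);   /* 99% stay defaulted, 1% cure */
         else
            in_default{m} = (u < 0.03);    /* 3% chance to default */
         prev = in_default{m};             /* carry this month's status forward */
      end;
      output;
   end;
run;
```

Carrying the previous month's status in an ordinary variable avoids the LAG problem altogether, because each loan's 12 months are processed inside one observation. A seed of 0 takes the seed from the clock; a positive seed in ranuni would make the run repeatable.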

Super User
Posts: 3,113

Re: Help turn code into Data Step

Astounding's suggestion is very much how we do Monte Carlo simulations of loan defaults.

Array processing can be blindingly fast as all of the processing happens in memory.

We have much more complicated default simulations going over 5 years by month for at least 10,000 loans and it all runs in 2 or 3 minutes on a 4-core SAS BI server.
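
Since the question mentions running this at least 10,000 times, one option is to loop over the replications inside the same data step and keep only a summary per run. This is only a sketch under assumptions: that the runs are independent replications, that the count of loans in default at month 12 is the statistic of interest, and that 100 replications is enough to try it out. The names rep_summary, rep, status and defaulted_m12 are hypothetical:

```sas
/* One output row per category and replication; no per-loan rows are written. */
data rep_summary(keep=Category rep defaulted_m12);
   set SummaryData;
   do rep = 1 to 100;                      /* hypothetical replication count */
      defaulted_m12 = 0;
      do i = 1 to NumberOfLoans;
         status = 0;                       /* assumption: loan starts current */
         do m = 1 to 12;
            u = ranuni(0);
            if status = 1 then status = (u >= 0.01);   /* 1% cure */
            else status = (u < 0.03);                  /* 3% default */
         end;
         defaulted_m12 = defaulted_m12 + status;       /* in default at month 12 */
      end;
      output;
   end;
run;
```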

Super Contributor
Posts: 418

Re: Help turn code into Data Step

Hi Master

Super User
Posts: 5,092

Re: Help turn code into Data Step

Well, you first have to decide whether you want 12 "months since default" variables or just 1 representing the final result.  Here's the code if you just want a single, final result.  All solutions assume you have already populated 12 default status variables.

data want;
   set have;
   array default {12} default_m1 - default_m12;

   * after calculating default_m1 - default_m12;
   months_since_default=0;
   do _n_=1 to 12;
      if default{_n_}=0 then months_since_default=0;
      else months_since_default + 1;
   end;
run;

At the end, you have to consider whether the final value of months_since_default is accurate, or whether you want to subtract 1 from the positive values.
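
If you want the 12-variable version instead, here is a rough sketch along the same lines. It assumes the same have data set with default_m1 - default_m12 coded 0/1; the names want12, running and months_since_default_m1 - months_since_default_m12 are made up for illustration:

```sas
data want12(drop=running);
   set have;
   array default {12} default_m1 - default_m12;
   array msd {12} months_since_default_m1 - months_since_default_m12;

   running = 0;
   do _n_ = 1 to 12;
      if default{_n_} = 0 then running = 0;   /* current: reset the counter */
      else running = running + 1;             /* in default: extend the streak */
      msd{_n_} = running;                     /* value as of this month */
   end;
run;
```

As written, the month in which the default starts gets a value of 1. If that month should count as 0 months since default, store running - 1 for the positive values instead.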

Good luck.
