I have included the first five observations from my data set:
DIR Vintage CO_AMT Total_Recov_Pmts Recov_Rate
0 JUL2019 $6,000 $2,000 33.33%
0 AUG2019 $10,000 $1,000 10.00%
0 SEP2019 $3,000 $1,500 50.00%
0 OCT2019 $100,000 $20,000 20.00%
0 NOV2019 $55,000 $30,000 54.54%
The output matrix should look like this:
DIR0 DIR1 DIR2 DIR3 DIR4 DIR5 DIR6 DIR7 DIR8 DIR9 DIR10 .............DIR365
Vintage
JUL2019 3.2% 4.1% 4.5% 5.2% ...................................
SEP2019 2.9% 3.0% 3.2% 3.3% ...............................
OCT2019
NOV2019
DEC2019
JAN2020
FEB2020
MAR2020
....
....
....
DEC2020
A recovery rate (interior matrix) should be calculated for each vintage(vertical axis) at the DIR on the horizontal axis(Days In Recovery). Thanks.
The data set in SAS form has been posted above. Thanks.
I still don't get how you want calculations done. So here is a sample dataset with 1 vintage (JUL2019) and 6 DIR values (0 through 4 and 25). Let's say, instead of 366 variables you only want 11: DIR0 through DIR10.
What would the values be based on the data I provide below?
What role would the record for DIR=25 have in calculating for DIR0 through DIR10? I ask that because your dataset has numerous records with DIR greater than 366, which is the requested level you specified. So should those records be part of the calculations?
data have;
input dir vintage :date9. co_amt total_recov_pmts recov_rate percent. ;
format vintage monyy7. recov_rate percent7.2 ;
datalines;
0 01jul2019 200 50 25%
1 01jul2019 300 150 50%
2 01jul2019 400 250 62.5%
3 01jul2019 500 350 70%
4 01jul2019 600 450 75%
25 01jul2019 1000 677 67.7%
run;
proc print;run;
My objective is to calculate the cumulative recovery rate at every day in recovery (DIR) for each respective vintage (charge-off month). Let me show you the manual calculation from your data example:
Vintage DIR CO_AMT TOTAL_RECOV_PMTS RECOV_RATE CUM_RECOV_RATE
01jul2019 0 200 50 25% 50/100,000 = .05%
01jul2019 1 300 150 50% 200/100,000 = .20%
01jul2019 2 400 250 62.5% 450/100,000 = .45%
01jul2019 3 500 350 70% 800/100,000 = .80%
01jul2019 4 1000 677 67.7% 1,477/100,000 = 1.477%
The cumulative recovery rate would sum the total recovery payments for all the DIR values
in the vintage and divide by the total charge-off amounts for the vintage. If the total charge-off amount for the vintage (July 2019) was $100,000, you would sum each consecutive DIR and divide by the $100,000 total charge-off amount. The calculation for each cell is included in the above data set. Hopefully this explanation makes the objective clear.
The cumulative recovery rate would sum the total recovery payments for all the DIR values in the vintage and divide by the total charge-off amounts for the vintage. If the total charge-off amount for the vintage (July 2019) was $100,000, you would sum each consecutive DIR and divide by the $100,00 total charge-off amount. The calculation for each cell is included in the data set you provided in the post. Hopefully this explanation makes the objective clear.
OK.
proc summary data=sample_data noprint nway;
class vintage;
var co_amt;
output out=need (keep=vintage co_amt_sum) sum=co_amt_sum;
run;
proc sort data=sample_data out=sample_sorted;
by vintage dir;
run;
data want (keep=vintage cum_rate0-cum_rate365);
merge sample_sorted need;
by vintage;
array crate {0:365} cum_rate0-cum_rate365;
retain cum_rate: ;
if first.vintage then cr=0;
cr + (total_recov_pmts/co_amt_sum);
format cum_rate: percent7.2;
if dir<=365 then crate{dir}=cr;
if last.vintage ;
do d=1 to 365;
if crate{d}=. then crate{d}=crate{d-1};
end;
run;
Some DIR values may be missing (say you have DIR=0 through DIR=13, followed by DIR=16. In that case you want to propagate the cum_rate13 through cum_rate14 and cum_rate15, before updating the value for cum_rate16. That's the do loop above.
My output generates zero percentages for each vintage @ every DIR. Let me debug this code to see what's I doing wrong.
Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.
Register today!Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.