BookmarkSubscribeRSS Feed
greg6363
Obsidian | Level 7

I have included the first five observations from my data set:

 

DIR   Vintage   CO_AMT   Total_Recov_Pmts   Recov_Rate

0       JUL2019    $6,000            $2,000                    33.33%

0       AUG2019    $10,000         $1,000                    10.00%

0       SEP2019    $3,000            $1,500                    50.00%

0       OCT2019    $100,000        $20,000                  20.00%

0       NOV2019    $55,000         $30,000                   54.54%

 

The output matrix should look like this:

 

                DIR0  DIR1 DIR2 DIR3 DIR4 DIR5 DIR6 DIR7 DIR8 DIR9 DIR10 .............DIR365

Vintage

JUL2019   3.2%  4.1% 4.5% 5.2% ...................................

SEP2019   2.9%  3.0%  3.2% 3.3%  ...............................

OCT2019

NOV2019

DEC2019

JAN2020

FEB2020

MAR2020

....

....

....

DEC2020

 

A recovery rate (interior matrix) should be calculated for each vintage(vertical axis) at the DIR on the horizontal axis(Days In Recovery).  Thanks.

greg6363
Obsidian | Level 7

The data set in SAS form has been posted above.  Thanks.

mkeintz
PROC Star

I still don't get how you want calculations done.  So here is a sample dataset with 1 vintage (JUL2019) and 6 DIR values (0 through 4 and 25).  Let's say, instead of 366 variables you only want 11:  DIR0 through DIR10.

 

   What would the values be based on the data I provide below?

   What role would the record for DIR=25 have in calculating for DIR0 through DIR10?  I ask that because your dataset has numerous records with DIR greater than 366, which is the requested level you specified.  So should those records be part of the calculations?

 

data have;
  input dir vintage :date9. co_amt total_recov_pmts  recov_rate percent. ;
  format vintage monyy7.   recov_rate percent7.2 ;
datalines;
0    01jul2019   200  50 25%
1    01jul2019   300 150 50%
2    01jul2019   400 250 62.5%
3    01jul2019   500 350 70%
4    01jul2019   600 450 75%
25   01jul2019  1000 677 67.7%
run;
proc print;run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
greg6363
Obsidian | Level 7

My objective is to calculate the cumulative recovery rate at every day in recovery (DIR) for each respective vintage (charge-off month).  Let me show you the manual calculation from your data example:

 

Vintage    DIR  CO_AMT TOTAL_RECOV_PMTS  RECOV_RATE  CUM_RECOV_RATE
01jul2019   0   200           50             25%       50/100,000 = .05%
01jul2019   1   300           150            50%      200/100,000 = .20%
01jul2019   2   400           250           62.5%     450/100,000 = .45%
01jul2019   3   500           350            70%      800/100,000 = .80%
01jul2019   4   1000          677           67.7%     1,477/100,000 = 1.477%

The cumulative recovery rate would sum the total recovery payments for all the DIR values
in the vintage and divide by the total charge-off amounts for the vintage. If the total charge-off amount for the vintage (July 2019) was $100,000, you would sum each consecutive DIR and divide by the $100,000 total charge-off amount.  The calculation for each cell is included in the above data set.  Hopefully this explanation makes the objective clear. 
greg6363
Obsidian | Level 7

The cumulative recovery rate would sum the total recovery payments for all the DIR values in the vintage and divide by the total charge-off amounts for the vintage.  If the total charge-off amount for the vintage (July 2019) was $100,000, you would sum each consecutive DIR and divide by the $100,00 total charge-off amount.  The calculation for each cell is included in the data set you provided in the post.  Hopefully this explanation makes the objective clear.

mkeintz
PROC Star

 

OK.

 

  1. You need to sum all the charge offs for each division to get the divisor you want.  You can do that with a proc summary. 

  2. Then you need, for each vintage, to sequentially read in the DIR=0 record, then the DIR=1 record, through the DIR365 record, calculating the cumulative recovery rate.  So you have to sort the sample data by VINTAGE/DIR (dataset SAPLE_SORTED below).  Then merge with the sum's in #1 above.  That would be a data step with a merge statement.

  3. But instead of (up to) 366 records, you want 366 variables: CUM_RATE0 through CUM_RATE365.  In this case that is solved via an array declaration, which will be indexed by DIR
proc summary data=sample_data noprint nway;
  class vintage;
  var co_amt;
  output out=need (keep=vintage co_amt_sum) sum=co_amt_sum;
run;

proc sort data=sample_data out=sample_sorted;
  by vintage dir;
run;

data want (keep=vintage cum_rate0-cum_rate365);
  merge sample_sorted need;
  by vintage;
  array crate {0:365} cum_rate0-cum_rate365;
  retain cum_rate: ;

  if first.vintage then cr=0;
  cr + (total_recov_pmts/co_amt_sum);
  format cum_rate: percent7.2;
  if dir<=365 then crate{dir}=cr;
  if last.vintage ;
  do d=1 to 365;
    if crate{d}=. then crate{d}=crate{d-1};
  end;
run;

Some DIR values may be missing  (say you have DIR=0 through DIR=13, followed by DIR=16.  In that case you want to propagate the cum_rate13 through cum_rate14 and cum_rate15, before updating the value for cum_rate16.  That's the do loop above.

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
greg6363
Obsidian | Level 7

My output generates zero percentages for each vintage @ every DIR.  Let me debug this code to see what's I doing wrong.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 21 replies
  • 1300 views
  • 1 like
  • 5 in conversation