BookmarkSubscribeRSS Feed
greg6363
Obsidian | Level 7

I have created an aggregate table for recovery rates with the following fields:

 

DIR (Days In Recovery)

Vintage (Month)

CO_Amt (ChargeOff Amount)

Total_Recov_Pmts (Total Recovery Payments)

Recov_Rate (Recovery Rate = Total Recovery Payments / ChargeOff Amount)

 

Now I am trying to create a matrix with cumulative recovery rates for each monthly vintage by DIR.

 

I am attempting to generate the cumulative recovery rate at every consecutive DIR (day of recovery) for each monthly vintage with the following macro loop:

 

data cum_recov_rates;
set master;
%macro recov_rate;
%do i=0 %to 365;
recov_rate&i = sum(of DIR&i)/co_amt;
%end;

%mend;
%recov_rate;
keep DIR Vintage recov_rate0-recov_rate365;
run;

 

I want the vintage to be on the vertical axis and the Days In Recovery (DIR) on the horizontal axis.  Unfortunately, I generate the following error message:

NOTE: Variable DIR0 is uninitialized. (Each individual DIR is uninitialized)

 

I have the DIR variable in the data set so I'm confused on why the log is generating this error.

 

Any assistance would be greatly appreciated.  Thanks.

 

21 REPLIES 21
PaigeMiller
Diamond | Level 26

No need for macros here. A DATA step without macros should be able to do this.

 

Please present us with sample data (following these instructions) and then show us the desired output.

--
Paige Miller
greg6363
Obsidian | Level 7

Let me give you the first five observations of the data set:

 

DIR   Vintage   CO_AMT   Total_Recov_Pmts   Recov_Rate

0       JUL2019    $6,000            $2,000                    33.33%

0       AUG2019    $10,000         $1,000                    10.00%

0       SEP2019    $3,000            $1,500                    50.00%

0       OCT2019    $100,000        $20,000                  20.00%

0       NOV2019    $55,000         $30,000                   54.54%

 

Here is how the output matrix should look:

 

                DIR0  DIR1 DIR2 DIR3 DIR4 DIR5 DIR6 DIR7 DIR8 DIR9 DIR10 .............DIR365

Vintage

JUL2019   3.2%  4.1% 4.5% 5.2% ...................................

SEP2019   2.9%  3.0%  3.2% 3.3%  ...............................

OCT2019

NOV2019

DEC2019

JAN2020

FEB2020

MAR2020

....

....

....

DEC2020

 

A recovery rate (interior matrix) should be calculated for each vintage(vertical axis) at the DIR on the horizontal axis(Days In Recovery).  Let me know if you need any other information.  Thanks.

PaigeMiller
Diamond | Level 26

@greg6363 wrote:

Let me give you the first five observations of the data set:

 

DIR   Vintage   CO_AMT   Total_Recov_Pmts   Recov_Rate

0       JUL2019    $6,000            $2,000                    33.33%

0       AUG2019    $10,000         $1,000                    10.00%

0       SEP2019    $3,000            $1,500                    50.00%

0       OCT2019    $100,000        $20,000                  20.00%

0       NOV2019    $55,000         $30,000                   54.54%

 


A specific set of instructions was provided so you can give us the data in a usable format. Please follow the instructions. Please provide a larger EXAMPLE data set which includes DIR=1 and DIR=2 as obviously we need to write a program that uses those as well.

 

 

Here is how the output matrix should look:

 

                DIR0  DIR1 DIR2 DIR3 DIR4 DIR5 DIR6 DIR7 DIR8 DIR9 DIR10 .............DIR365

Vintage

JUL2019   3.2%  4.1% 4.5% 5.2% ...................................

SEP2019   2.9%  3.0%  3.2% 3.3%  ...............................

OCT2019

NOV2019

DEC2019

JAN2020

FEB2020

MAR2020

....

....

....

DEC2020

 

A recovery rate (interior matrix) should be calculated for each vintage(vertical axis) at the DIR on the horizontal axis(Days In Recovery).  Let me know if you need any other information.

 

How do you get 3.2% for JUL2019 and DIR0? Since your data doesn't show any DIR=1, can you show us so we have a chance of computing the "cumulative" part of this (I assume that's where the "cumulative" comes from). Can you provide example data that matches the desired output?

--
Paige Miller
greg6363
Obsidian | Level 7

I have enclosed the data set.  Let me know what I am missing in order to generate a cumulative rate.  Thanks.

greg6363
Obsidian | Level 7

Enclosed is the desired output layout.  Thanks.

Reeza
Super User
sum(of DIR&i) => what are you expecting in this line? That would only reference a single value ever at a time so the OF isn't necessary but is it what you intended?
I would also suggest an array, not a macro, as there's no need and it may not work the way you intend it to.

And what is the math formula you're trying to implement? I'm not familiar with the term recovery rate so not sure what the calculation should be when correct.
greg6363
Obsidian | Level 7

I included the formula for the recovery rate in the variable description list:  

 

Recov_Rate (Recovery Rate = Total Recovery Payments / ChargeOff Amount)

 

The DIR (Days In Recovery) variable has 365 values (one for each day).  I want to turn each value into its own variable so that I can calculate the recovery rate at each day which is what I thought the do loop would create within the macro.  Wouldn't an array be too long to manage?  Let me know what I am missing.  Thanks.

Reeza
Super User
You don't need IML but you do need to provide sample data, preferably using the sample data instructions Paige included above.
You need an array, an array is not long since it's indexed so I have no idea what you mean by an array being too long to manage.
Restructuring your data to start off may make all of this significantly simpler anyways, once the full example is prepared I suspect that's the optimal solution.

Axis are terms used for graphs not data.
greg6363
Obsidian | Level 7

OK, I have enclosed the data set.  For each vintage, I want to calculate the cumulative recovery rate for each value (0,1,2,....,n) in the DIR column.  You have the aggregate recovery rate for each individual vintage at the DIR (daysinrecovery) in the last column so I want to add each subsequent daily rate to generate a running total for the individual vintage by DIR.  I hope this information provides a better explanation.  I thought running a macro would be easier but let me know if the array is better.  Thanks.

PaigeMiller
Diamond | Level 26

We specifically asked for usable data according to specific instructions.


Excel files are generally not usable. Many people will not even download them as they are a security threat.

 

The best way for you to get answers is for you to provide the data using the method specified, instead of ignoring what we said and having us ask three times.

 

Also, I don't see an explanation of how you got 3.2% in your sample output.

--
Paige Miller
greg6363
Obsidian | Level 7

My apologies.  I clicked on the wrong file.  Here is the sample SAS data set used for my program.  Thanks.

greg6363
Obsidian | Level 7

As for an explanation of the formula for the recovery rate, the aggregate recovery rate is the quotient of the total_recov_pmts and the co_amt.  In the data set, the recovery rate divides the total recovery payments field by the co_amt for the respective vintage on the particular DIR (Days In Recovery) value.

 

The goal of the cumulative recovery rate is to add each subsequent recovery rate as a percentage of the overall recovery rate.  For instance, if the total (aggregate) recovery rate for October 2019 is 28.47%, I want to know the incremental recovery rate at each DIR for that particular vintage.  If DIR1 is 0.57%, DIR2 is 0.33% and DIR3 is 0.22%, then the cumulative recovery rate at DIR3 should equal 1.12% and continue to accumulate with each consecutive DIR until it reaches 28.47% where it will stop.  Hopefully this explanation provides enough context.  Thanks.

ballardw
Super User

First thing. It is very bad practice to attempt to put a macro definition in the middle of a data step or other procedure. It makes it hard to follow actual execution flow.

 

You apparently have exactly one existing variable Dir in the data set master.

Your code references 366 variables that don't exist. that is you macro generating with Dir&i .

 

Example data for maybe 5 or 10 "months" and what you expect the result to look like.

 

SAS data sets do not have "axis" per se. Horizontal is variables, vertical is observations.

 

You may check if you have access to SAS/IML as that sounds like more of the tool that you may want.

greg6363
Obsidian | Level 7

Unfortunately, I don't have a license for SAS/IML.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 21 replies
  • 1770 views
  • 1 like
  • 5 in conversation