I have created an aggregate table for recovery rates with the following fields:
DIR (Days In Recovery)
Vintage (Month)
CO_Amt (ChargeOff Amount)
Total_Recov_Pmts (Total Recovery Payments)
Recov_Rate (Recovery Rate = Total Recovery Payments / ChargeOff Amount)
Now I am trying to create a matrix with cumulative recovery rates for each monthly vintage by DIR.
I am attempting to generate the cumulative recovery rate at every consecutive DIR (day of recovery) for each monthly vintage with the following macro loop:
data cum_recov_rates;
set master;
%macro recov_rate;
%do i=0 %to 365;
recov_rate&i = sum(of DIR&i)/co_amt;
%end;
%mend;
%recov_rate;
keep DIR Vintage recov_rate0-recov_rate365;
run;
I want the vintage to be on the vertical axis and the Days In Recovery (DIR) on the horizontal axis. Unfortunately, I generate the following error message:
NOTE: Variable DIR0 is uninitialized. (Each individual DIR is uninitialized)
I have the DIR variable in the data set so I'm confused on why the log is generating this error.
Any assistance would be greatly appreciated. Thanks.
No need for macros here. A DATA step without macros should be able to do this.
Please present us with sample data (following these instructions) and then show us the desired output.
Let me give you the first five observations of the data set:
DIR Vintage CO_AMT Total_Recov_Pmts Recov_Rate
0 JUL2019 $6,000 $2,000 33.33%
0 AUG2019 $10,000 $1,000 10.00%
0 SEP2019 $3,000 $1,500 50.00%
0 OCT2019 $100,000 $20,000 20.00%
0 NOV2019 $55,000 $30,000 54.54%
Here is how the output matrix should look:
DIR0 DIR1 DIR2 DIR3 DIR4 DIR5 DIR6 DIR7 DIR8 DIR9 DIR10 .............DIR365
Vintage
JUL2019 3.2% 4.1% 4.5% 5.2% ...................................
SEP2019 2.9% 3.0% 3.2% 3.3% ...............................
OCT2019
NOV2019
DEC2019
JAN2020
FEB2020
MAR2020
....
....
....
DEC2020
A recovery rate (interior matrix) should be calculated for each vintage(vertical axis) at the DIR on the horizontal axis(Days In Recovery). Let me know if you need any other information. Thanks.
@greg6363 wrote:
Let me give you the first five observations of the data set:
DIR Vintage CO_AMT Total_Recov_Pmts Recov_Rate
0 JUL2019 $6,000 $2,000 33.33%
0 AUG2019 $10,000 $1,000 10.00%
0 SEP2019 $3,000 $1,500 50.00%
0 OCT2019 $100,000 $20,000 20.00%
0 NOV2019 $55,000 $30,000 54.54%
A specific set of instructions was provided so you can give us the data in a usable format. Please follow the instructions. Please provide a larger EXAMPLE data set which includes DIR=1 and DIR=2 as obviously we need to write a program that uses those as well.
Here is how the output matrix should look:
DIR0 DIR1 DIR2 DIR3 DIR4 DIR5 DIR6 DIR7 DIR8 DIR9 DIR10 .............DIR365
Vintage
JUL2019 3.2% 4.1% 4.5% 5.2% ...................................
SEP2019 2.9% 3.0% 3.2% 3.3% ...............................
OCT2019
NOV2019
DEC2019
JAN2020
FEB2020
MAR2020
....
....
....
DEC2020
A recovery rate (interior matrix) should be calculated for each vintage(vertical axis) at the DIR on the horizontal axis(Days In Recovery). Let me know if you need any other information.
How do you get 3.2% for JUL2019 and DIR0? Since your data doesn't show any DIR=1, can you show us so we have a chance of computing the "cumulative" part of this (I assume that's where the "cumulative" comes from). Can you provide example data that matches the desired output?
I have enclosed the data set. Let me know what I am missing in order to generate a cumulative rate. Thanks.
Enclosed is the desired output layout. Thanks.
I included the formula for the recovery rate in the variable description list:
Recov_Rate (Recovery Rate = Total Recovery Payments / ChargeOff Amount)
The DIR (Days In Recovery) variable has 365 values (one for each day). I want to turn each value into its own variable so that I can calculate the recovery rate at each day which is what I thought the do loop would create within the macro. Wouldn't an array be too long to manage? Let me know what I am missing. Thanks.
OK, I have enclosed the data set. For each vintage, I want to calculate the cumulative recovery rate for each value (0,1,2,....,n) in the DIR column. You have the aggregate recovery rate for each individual vintage at the DIR (daysinrecovery) in the last column so I want to add each subsequent daily rate to generate a running total for the individual vintage by DIR. I hope this information provides a better explanation. I thought running a macro would be easier but let me know if the array is better. Thanks.
We specifically asked for usable data according to specific instructions.
Excel files are generally not usable. Many people will not even download them as they are a security threat.
The best way for you to get answers is for you to provide the data using the method specified, instead of ignoring what we said and having us ask three times.
Also, I don't see an explanation of how you got 3.2% in your sample output.
My apologies. I clicked on the wrong file. Here is the sample SAS data set used for my program. Thanks.
As for an explanation of the formula for the recovery rate, the aggregate recovery rate is the quotient of the total_recov_pmts and the co_amt. In the data set, the recovery rate divides the total recovery payments field by the co_amt for the respective vintage on the particular DIR (Days In Recovery) value.
The goal of the cumulative recovery rate is to add each subsequent recovery rate as a percentage of the overall recovery rate. For instance, if the total (aggregate) recovery rate for October 2019 is 28.47%, I want to know the incremental recovery rate at each DIR for that particular vintage. If DIR1 is 0.57%, DIR2 is 0.33% and DIR3 is 0.22%, then the cumulative recovery rate at DIR3 should equal 1.12% and continue to accumulate with each consecutive DIR until it reaches 28.47% where it will stop. Hopefully this explanation provides enough context. Thanks.
First thing. It is very bad practice to attempt to put a macro definition in the middle of a data step or other procedure. It makes it hard to follow actual execution flow.
You apparently have exactly one existing variable Dir in the data set master.
Your code references 366 variables that don't exist. that is you macro generating with Dir&i .
Example data for maybe 5 or 10 "months" and what you expect the result to look like.
SAS data sets do not have "axis" per se. Horizontal is variables, vertical is observations.
You may check if you have access to SAS/IML as that sounds like more of the tool that you may want.
Unfortunately, I don't have a license for SAS/IML.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.