I have SAS code that returns variables with their bin scores. After that, new variables are created based on those bin scores. I want to automate this process.
For example, below is the table that is used to create new variables with suffix "_sc".
Variable | Min_Bin | Max_Bin | Score |
Avg_Balance_last_6M | 25 | 50 | 0.06 |
Avg_Balance_last_6M | 51 | 75 | 0.05 |
Avg_Balance_last_6M | 76 | 100 | 0.07 |
Fee_1_3M | 0.005 | 0.009 | 0.007 |
Fee_1_3M | 0.01 | 0.03 | 0.06 |
Fee_1_3M | 0.03 | 0.05 | 0.02 |
Fee_1_3M | 0.05 | 0.07 | 0.03 |
Manual Code :
IF Avg_Balance_last_6M >= 25 then Avg_Balance_last_6M_sc = 0.06;
IF Avg_Balance_last_6M >= 51 then Avg_Balance_last_6M_sc = 0.05;
IF Avg_Balance_last_6M >= 76 then Avg_Balance_last_6M_sc = 0.07;
IF Fee_1_3M >= 0.005 then Fee_1_3M_sc = 0.007;
IF Fee_1_3M >= 0.01 then Fee_1_3M_sc = 0.06;
IF Fee_1_3M >= 0.03 then Fee_1_3M_sc = 0.02;
IF Fee_1_3M >= 0.05 then Fee_1_3M_sc = 0.03;
I would be very tempted to use the information in your first table to create a custom format or informat, with Min_Bin as Start, Max_Bin as End and Score as Label. Then I could either just use the formatted value of the existing variable, or make an informat and use it to create the new variables.
data formatcontrol;
set have (rename=(Variable=FMTName)); /* format names cannot end in a numeral, so if any of the variable names do, we'll need to address that */
Start= put(Min_bin, best8.);
End = put(Max_bin,Best8.);
Label = put (score,z9.7);
run;
proc format cntlin=formatcontrol;
run;
Then you use the formats, for example:
Proc freq data=have;
table Avg_Balance_last_6M Fee_1_3M;
format Avg_Balance_last_6M Avg_Balance_last_6M. Fee_1_3M Fee_1_3M.;
run;
Or create an informat and then use:
Fee_1_3M_sc = input (Fee_1_3M,Fee_1_3M.);
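If only the formats exist (no informat), the score variables can still be derived by formatting the value with PUT and reading the label back with INPUT. This is a minimal sketch, not a tested solution: it assumes the PROC FORMAT step above ran successfully, that the data to score is in a dataset called source, and the output name scored is made up for illustration.

```sas
data scored;                       /* hypothetical output dataset name */
  set source;                      /* assumed name of the data to score */
  /* PUT returns the format label as text (e.g. 0.0600000);
     INPUT converts that text back to a number.
     ?? suppresses error messages if the text is not numeric. */
  Avg_Balance_last_6M_sc = input(put(Avg_Balance_last_6M, Avg_Balance_last_6M.), ?? best12.);
  Fee_1_3M_sc            = input(put(Fee_1_3M, Fee_1_3M.), ?? best12.);
run;
```

Values that fall outside every bin are not covered by the format, so they come back as the original value rather than a score. If that matters, the bin table probably needs a catch-all range.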
I strongly recommend that you don't do this alone. Find a local senior SAS programmer who can guide you through some of the issues. Based on what you have posted so far, here are two issues that you may want to consider.
1. What if a variable name is already 30 characters long? There won't be room to add "_sc" to the end of it, because SAS variable names are limited to 32 characters. How should that be handled?
2. In an automated system, it is often important to make the code as speedy as possible. The manual code you have posted may run quickly enough, or you may want to speed it up by adding ELSE. For example:
IF Avg_Balance_last_6M >= 76 then Avg_Balance_last_6M_sc = 0.07;
else IF Avg_Balance_last_6M >= 51 then Avg_Balance_last_6M_sc = 0.05;
else IF Avg_Balance_last_6M >= 25 then Avg_Balance_last_6M_sc = 0.06;
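The name-length issue in point 1 can be checked before any code is generated. A minimal sketch, assuming the bin table is stored in a dataset called rules with a character column Variable (both names are assumptions here):

```sas
data _null_;
  set rules;                       /* assumed name of the bin table */
  by variable notsorted;           /* report each variable only once */
  /* SAS names are limited to 32 characters, so a base name longer
     than 29 leaves no room for the 3-character _sc suffix */
  if first.variable and length(strip(variable)) > 29 then
    put 'WARNING: no room for the _sc suffix: ' variable=;
run;
```

This only flags the problem names in the log. What to do with them (truncate, abbreviate, use a lookup table) is still a decision to make.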
I would judge it extremely likely that a senior programmer would find additional issues and shortcuts, and would definitely recommend that route.
Good luck.
I recommend using call execute for this:
So make a dataset (rules) from this data. In this example I'm assuming that your original data is in the dataset "source".
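data _null_;
set rules end=last;
/* open the generated data step once, before the first rule */
if _n_=1 then call execute ('data source; set source;');
/* one IF statement per rule; CATX converts the numeric values without log notes. */
/* Rules must be in ascending Min_Bin order within each variable, so the highest matching bin wins. */
call execute (catx(' ', 'if', variable, 'GE', min_bin, 'then', cats(variable,'_SC'), '=', score, ';'));
/* close the generated data step after the last rule */
if last then call execute ('run;');
run;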
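The same CALL EXECUTE idea can also generate the IF / ELSE IF chain suggested earlier in the thread, which stops testing a variable as soon as one bin matches. This is only a sketch under assumptions: rules has the columns Variable, Min_Bin and Score, the data is in source, and rules_sorted is a made-up name for the sorted copy.

```sas
/* Highest Min_Bin first within each variable, so the first match is the right bin */
proc sort data=rules out=rules_sorted;   /* rules_sorted is a hypothetical name */
  by variable descending min_bin;
run;

data _null_;
  set rules_sorted end=last;
  by variable;
  if _n_ = 1 then call execute('data source; set source;');
  /* The first rule of each variable starts a new chain with a plain IF;
     the following rules are prefixed with ELSE. CATX skips the blank prefix. */
  call execute(catx(' ',
       ifc(first.variable, '', 'else'),
       'if', variable, '>=', min_bin,
       'then', cats(variable, '_sc'), '=', score, ';'));
  if last then call execute('run;');
run;
```

For the sample table this should generate statements equivalent to the hand-written ELSE IF version. Values below the lowest Min_Bin leave the _sc variable missing.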