BookmarkSubscribeRSS Feed
KonstantinV1
Calcite | Level 5

 

I have a categorical variable Q1 taking the values from 1 to 4 and a binary variable Bad taking the values 1 and 0.

 

I want to calculate WoE and IV for the variable according to the existing categories in ascending order.

 

I know that HPBIN has option to calculate those statistics but it would also create other bins - i want to use the variable binned as it is. I have also tried to use the output of the Freq procedure, but it becomes inconvenient for the cases in which there is no events in some of the categories.

1 REPLY 1
Ksharp
Super User

Do you want combine these into other group ? or just use its original levels ?

 



%let var=marital   ;


title "变量: &var";
proc sql;
create table woe_&var as
 select &var as group,
sum(good_bad='bad') as n_bad label='bad的个数',sum(good_bad='good') as n_good label='good的个数',
sum(good_bad='bad')/(select sum(good_bad='bad') from have ) as bad_dist  format=percent7.2 label='bad的占比',
sum(good_bad='good')/(select sum(good_bad='good') from have ) as good_dist  format=percent7.2 label='good的占比',
log(calculated Bad_Dist/calculated Good_Dist) as woe
from have
   group by &var
    order by woe;


select *,sum(  (Bad_Dist-Good_Dist)*woe  ) as iv
 from woe_&var ;

quit;

 

https://www.sas.com/content/dam/SAS/support/en/sas-global-forum-proceedings/2018/1808-2018.pdf

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 1 reply
  • 2664 views
  • 0 likes
  • 2 in conversation