Hi,
I have a large dataset (c. 350k records) with a unique identifier (ID), a target, and c. 1000 variables for which I need to calculate the information value.
Ideally, I would like to get a list of the variables with WOE and information value attached so I can start filtering.
Does anybody have any code that will allow me to do this efficiently in SAS?
Thanks
I recall attending a presentation on this topic at SAS Global Forum and SESUG:
Lin 2013: https://support.sas.com/resources/papers/proceedings13/095-2013.pdf
Lin 2015: https://support.sas.com/resources/papers/proceedings15/3242-2015.pdf
Also, PROC HPBIN can compute the weight of evidence and the information value.
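For what it's worth, here is a minimal sketch of the two-pass PROC HPBIN approach, modeled on the WOE example in the HPBIN documentation. The dataset name (work.mydata), the target name (target), and the variable list (x1-x1000) are placeholders I made up, so substitute your own. The ODS table names are as I remember them from the documentation; run ODS TRACE ON to confirm them on your release.

```sas
/* Pass 1: bin the candidate inputs and save the bin boundaries.      */
/* work.mydata and x1-x1000 are hypothetical names; leave the ID and  */
/* the target out of the INPUT list.                                  */
proc hpbin data=work.mydata numbin=10 pseudo_quantile;
   input x1-x1000;
   ods output Mapping=work.bin_map;   /* bin definitions per variable */
run;

/* Pass 2: reuse the saved bins to compute WOE per bin and IV per     */
/* variable against a binary target (hypothetical name: target).      */
proc hpbin data=work.mydata woe bins_meta=work.bin_map;
   target target / level=binary order=desc;
   /* Table names assumed from the documentation; verify with         */
   /* "ods trace on;" if these are not found.                         */
   ods output InfoValue=work.iv_table WOE=work.woe_table;
run;

/* Inspect the information values before filtering the variable list. */
proc print data=work.iv_table;
run;
```

The number of bins and the binning method (NUMBIN=10, PSEUDO_QUANTILE) are just starting values; the resulting IV depends on how the variables are binned, so it is worth trying a few settings before filtering.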
I would suggest using PROC HPGENSELECT or PROC PLS to select the 30 most significant variables,
and then getting the WOE and IV for those 30 variables.
If you want better WOE values and a higher IV to make your scorecard stronger,
I wrote a paper about that, but it requires SAS/IML and would take you a lot of time.
If you have SAS/EM's Scorecard node, that should be your first choice.
"Get Better Weight of Evidence for Scorecards Using a Genetic Algorithm"
https://www.sas.com/content/dam/SAS/support/en/sas-global-forum-proceedings/2018/1808-2018.pdf