I published a paper at the Washington Users of SAS Software meeting in 2011 on discretizing a continuous variable with the chi-squared statistic, and I am sharing it with the SAS community.
I describe the taxonomy of discretization algorithms, give an example of ChiMerge (a predecessor of ChiD, the algorithm that I developed), and compare the ChiD algorithm with the SAS Enterprise Miner 4.3 bucketing algorithm. I discuss the results of the comparison and conclude that the ChiD algorithm generates cutsets of similar quality to those computed by the Enterprise Miner decision tree algorithm.
The conference paper and the SAS code for the ChiD algorithm are included as attachments.
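
For readers who have not seen ChiMerge before, the following is a minimal Python sketch of the general bottom-up idea: start with one interval per distinct value, then repeatedly merge the adjacent pair of intervals with the smallest chi-squared statistic until every remaining pair exceeds a threshold. This is not the attached SAS code and it is not the ChiD algorithm. The function names (`chi2_pair`, `chimerge`), the `max_intervals` option, the default threshold of 2.706 (the chi-squared critical value for 1 degree of freedom at the 0.10 level, which suits a two-class target), and the choice to skip cells with zero expected count are all assumptions made for illustration.

```python
from collections import Counter


def chi2_pair(a, b, classes):
    """Chi-squared statistic of the 2 x k table formed by two adjacent intervals.

    a and b are Counters mapping class label -> observation count.
    """
    total = sum(a.values()) + sum(b.values())
    stat = 0.0
    for row in (a, b):
        row_total = sum(row.values())
        for c in classes:
            expected = row_total * (a[c] + b[c]) / total
            # Assumption: a class absent from both intervals contributes nothing.
            if expected > 0:
                stat += (row[c] - expected) ** 2 / expected
    return stat


def chimerge(values, labels, threshold=2.706, max_intervals=None):
    """Return sorted cut points; x falls in the interval starting at the
    largest cut point that is <= x (or in the first interval if none is)."""
    classes = sorted(set(labels))

    # One initial interval per distinct value, holding its class counts.
    counts = {}
    for v, y in zip(values, labels):
        counts.setdefault(v, Counter())[y] += 1
    lows = sorted(counts)                 # lower bound of each interval
    tables = [counts[v] for v in lows]    # class counts of each interval

    while len(tables) > 1:
        stats = [chi2_pair(tables[i], tables[i + 1], classes)
                 for i in range(len(tables) - 1)]
        i = min(range(len(stats)), key=stats.__getitem__)
        too_many = max_intervals is not None and len(tables) > max_intervals
        # Stop once the most similar pair is already significantly different,
        # unless the optional cap on the number of intervals is still exceeded.
        if stats[i] >= threshold and not too_many:
            break
        tables[i] = tables[i] + tables[i + 1]   # merge the pair
        del tables[i + 1], lows[i + 1]

    return lows[1:]
```

As a small check, calling `chimerge([1, 2, 3, 4, 5, 6], [0, 0, 0, 1, 1, 1])` returns `[4]`: the values 1 to 3 and 4 to 6 each collapse into one interval, and the remaining pair has a chi-squared statistic of 6.0, above the threshold.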