I published a paper at the 2011 Washington Users of SAS Software meeting on discretizing a continuous variable using the chi-squared statistic, and I am sharing it here with the SAS community.
In the paper, I describe the taxonomy of discretization algorithms, give an example of ChiMerge, a predecessor of the ChiD algorithm that I developed, and present comparative results between the ChiD algorithm and the SAS Enterprise Miner 4.3 bucketing algorithm. I discuss the results of the comparison and conclude that the ChiD algorithm generates cut sets of similar quality to those computed by the Enterprise Miner decision tree algorithm.
The conference paper and the SAS code for the ChiD algorithm are included as attachments.
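For readers who have not seen ChiMerge before, here is a minimal Python sketch of the general bottom-up idea: start with one interval per distinct value, then repeatedly merge the adjacent pair of intervals whose class distributions are most alike according to the chi-squared statistic. This is not the attached SAS code and it is not the ChiD algorithm. The function names `chi2_pair` and `chimerge`, the `min_intervals` parameter, and the default threshold of 3.841 (the 0.05 critical value for one degree of freedom, which suits a two-class target) are assumptions made for illustration, and cells with a zero expected count are simply skipped.

```python
from collections import Counter


def chi2_pair(a, b, classes):
    """Pearson chi-squared statistic for the 2 x k table of two adjacent intervals.

    `a` and `b` are Counters mapping class label -> count.
    Cells whose expected count is zero are skipped (a simplification).
    """
    total = sum(a.values()) + sum(b.values())
    stat = 0.0
    for row in (a, b):
        row_total = sum(row.values())
        for c in classes:
            col_total = a[c] + b[c]
            expected = row_total * col_total / total
            if expected > 0:
                stat += (row[c] - expected) ** 2 / expected
    return stat


def chimerge(values, labels, threshold=3.841, min_intervals=2):
    """Return cut points (lower bounds of every interval after the first).

    Illustrative sketch only: `threshold` and `min_intervals` are assumed
    stopping rules, not parameters taken from the paper.
    """
    classes = sorted(set(labels))

    # One initial interval per distinct value, holding its class counts.
    counts = {}
    for v, y in zip(values, labels):
        counts.setdefault(v, Counter())[y] += 1
    lowers = sorted(counts)
    tables = [counts[v] for v in lowers]

    while len(tables) > min_intervals:
        stats = [
            chi2_pair(tables[i], tables[i + 1], classes)
            for i in range(len(tables) - 1)
        ]
        i = min(range(len(stats)), key=stats.__getitem__)
        if stats[i] > threshold:
            break  # every adjacent pair differs significantly; stop merging
        # Merge interval i+1 into interval i.
        tables[i] = tables[i] + tables[i + 1]
        del tables[i + 1]
        del lowers[i + 1]

    return lowers[1:]
```

For example, `chimerge([1, 2, 3, 4, 5, 6], list("aaabbb"))` returns `[4]`, meaning the values are split into the intervals below 4 and from 4 upward. With more than two target classes, the threshold would need to come from a chi-squared distribution with (number of classes minus 1) degrees of freedom.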