I published a paper at the Washington Users of SAS Software meeting in 2011 on discretizing a continuous variable using the Chi-Squared statistic, and I am sharing it with the SAS community.
I describe the taxonomy of discretization algorithms, give an example of ChiMerge, a predecessor of the ChiD algorithm that I developed, and present comparative results between the ChiD algorithm and the SAS Enterprise Miner 4.3 bucketing algorithm. I discuss the results of my comparison and conclude that the ChiD algorithm generates cutsets that are of similar quality to those computed by the Enterprise Miner decision tree algorithm.
The conference paper and the SAS code for the ChiD algorithm are included as attachments.
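For readers unfamiliar with ChiMerge, the predecessor algorithm mentioned above, here is a minimal sketch of its general bottom-up idea: start with one interval per distinct value, then repeatedly merge the adjacent pair of intervals with the lowest chi-squared statistic until every remaining pair exceeds a threshold. This is not the ChiD algorithm and not the attached SAS code. It is written in Python purely for illustration, and the function names, the default threshold of 2.706 (the 0.10 critical value for 1 degree of freedom, which assumes a two-class target), and the `min_intervals` stopping rule are all assumptions of this sketch.

```python
from collections import Counter


def chi2_adjacent(a, b):
    """Pearson chi-squared statistic for the class counts of two adjacent intervals."""
    classes = set(a) | set(b)
    total = sum(a.values()) + sum(b.values())
    stat = 0.0
    for row in (a, b):
        row_total = sum(row.values())
        for c in classes:
            col_total = a[c] + b[c]
            expected = row_total * col_total / total
            if expected > 0:
                stat += (row[c] - expected) ** 2 / expected
    return stat


def chimerge(values, labels, threshold=2.706, min_intervals=2):
    """Return cut points (lower bounds of every interval after the first).

    Hypothetical helper for illustration; threshold and min_intervals
    are assumed defaults, not values taken from the paper.
    """
    # Start with one interval per distinct value, holding its class counts.
    counts = {}
    for v, y in zip(values, labels):
        counts.setdefault(v, Counter())[y] += 1
    intervals = [(v, counts[v]) for v in sorted(counts)]

    while len(intervals) > min_intervals:
        stats = [
            chi2_adjacent(intervals[i][1], intervals[i + 1][1])
            for i in range(len(intervals) - 1)
        ]
        i = min(range(len(stats)), key=stats.__getitem__)
        # Stop once even the most similar adjacent pair differs significantly.
        if stats[i] >= threshold:
            break
        merged = intervals[i][1] + intervals[i + 1][1]
        intervals[i:i + 2] = [(intervals[i][0], merged)]

    return [lower for lower, _ in intervals[1:]]
```

Calling `chimerge(values, labels)` with a list of numeric values and a parallel list of class labels returns the lower bounds of the merged intervals, which can then serve as bin boundaries.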