Tenno1
Calcite | Level 5

I developed an algorithm that uses the chi-squared test to perform supervised discretization of a continuous variable. I described it in the paper "ChiD-A Chi-Squared Discretization Algorithm" published in the WUSS 2011 Proceedings available at http://www.wuss.org/proceedings11/

The stopping criterion is not very intelligent, and I would like to know if there are better ways of stopping the discretization process.

art297
Opal | Level 21

I should begin by admitting that I am not a statistician and am not familiar with either the method you are using or with IML.

That said, when I have faced situations that needed a reasonably intelligent stopping point, I found a rather brute-force approach useful: wrap the code in a macro that uses a binary decision tree to test various criteria until an acceptable limit is reached.

Of course, if you are asking what such a criterion might be, please just ignore this post.

Rick_SAS
SAS Super FREQ

That is probably a good general approach. In IML you don't even need a macro: you can wrap the code in a module definition and call the module at each step of an iterative method.

For example, if you're trying to find a zero, see http://blogs.sas.com/content/iml/2011/08/03/finding-the-root-of-a-univariate-function/

Or, if you're trying to optimize some criterion, see http://blogs.sas.com/content/iml/2011/10/12/maximum-likelihood-estimation-in-sasiml/
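To make the "wrap the criterion in a module and call it at each step" idea concrete, here is a minimal sketch. It is written in Python rather than IML, and it is not taken from the linked posts or from the ChiD paper. The names `bisect_root` and `criterion` are hypothetical, and the cubic is only a placeholder for whatever quantity the real stopping rule would evaluate. The iteration is plain bisection for finding a zero.

```python
from typing import Callable


def bisect_root(f: Callable[[float], float], lo: float, hi: float,
                tol: float = 1e-8, max_iter: int = 200) -> float:
    """Find a zero of f in [lo, hi] by bisection.

    f plays the role of the wrapped "module": it is called once per
    iteration, and the loop stops when the bracket is narrower than tol.
    """
    f_lo, f_hi = f(lo), f(hi)
    if f_lo == 0.0:
        return lo
    if f_hi == 0.0:
        return hi
    if f_lo * f_hi > 0:
        raise ValueError("f(lo) and f(hi) must have opposite signs")

    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        # Stop on an exact zero or once the half-width is below tol.
        if f_mid == 0.0 or 0.5 * (hi - lo) < tol:
            return mid
        # Keep the half of the bracket where the sign change lies.
        if f_lo * f_mid < 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


def criterion(x: float) -> float:
    """Placeholder criterion (an assumption): zero at the cube root of 2."""
    return x ** 3 - 2.0


root = bisect_root(criterion, 0.0, 2.0)
```

Because the criterion is passed in as a function, swapping in a different stopping rule means changing only `criterion`, not the loop that calls it.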
