Re: Predicting Analytics on Big Data
What is the purpose/effect of the NBINS option when used with the DECISIONTREE statement (page 3-89 of course text)?
The reference on page 3.70 states: "specifies the number of bins to use in the calculation of the decision tree. The number of bins affects the accuracy of the tree".
What does "use in the calculation of the decision tree" mean? Does it affect how the splitting algorithm works?
My Answer:
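In IMSTAT decision tree modeling, interval input variables are binned during the split search, and the optimal split point is determined using information gain. Users can choose the number of bins (2 to 10) used to transform each interval input variable into a binned variable. In that sense NBINS does affect how the splitting algorithm works: more bins give the split search a finer set of candidate split points to evaluate.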
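To make the idea concrete, here is a minimal Python sketch of a binned split search. It is not IMSTAT's actual implementation: equal-width binning, the two-way split, and the function names `entropy` and `best_binned_split` are all assumptions made for illustration. It only shows how the number of bins limits which split points the search can consider.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_binned_split(values, labels, nbins):
    """Bin an interval input into `nbins` equal-width bins (an assumption,
    not necessarily what IMSTAT does) and return the (cut_point, gain)
    of the bin boundary with the highest information gain."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / nbins
    # Only the nbins - 1 interior bin boundaries are candidate split points.
    cuts = [lo + k * width for k in range(1, nbins)]
    parent = entropy(labels)
    n = len(labels)
    best_cut, best_gain = None, 0.0
    for cut in cuts:
        left = [y for x, y in zip(values, labels) if x < cut]
        right = [y for x, y in zip(values, labels) if x >= cut]
        if not left or not right:
            continue  # a split must leave observations on both sides
        children = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
        gain = parent - children  # information gain of this candidate
        if gain > best_gain:
            best_cut, best_gain = cut, gain
    return best_cut, best_gain


values = [1, 2, 3, 4, 5, 6, 7, 8]
labels = [0, 0, 0, 1, 1, 1, 1, 1]
coarse = best_binned_split(values, labels, nbins=2)
fine = best_binned_split(values, labels, nbins=7)
```

In this toy data the classes separate cleanly between 3 and 4. With 2 bins the only candidate boundary is 4.5, which misplaces one observation; with 7 bins the boundary at 4.0 is available and yields a higher information gain. That is one plausible reading of "the number of bins affects the accuracy of the tree".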