Interactive Grouping - documentation


Hello, 

Could anybody help me find the documentation for the Interactive Grouping node in SAS Enterprise Miner? I am interested in the details, such as precisely how the algorithm works. As far as I know, a continuous variable is first binned, after which a decision tree is applied. In particular, which decision tree algorithm is used? Also, is the decision tree run on the already transformed WOE values, or on the original variable?

Finally, this might be a long shot, but is it possible to combine this method with other variables to capture interactions?

 

Best,

Marin


Accepted Solutions
Solution
‎01-17-2017 07:20 AM
SAS Super FREQ
Posts: 306

Re: Interactive Grouping - documentation

The EM Reference Help for this node (under SAS Credit Scoring) provides a good amount of detail. You are correct that the continuous (interval) inputs are first binned via bucket or quantile binning. Those bins are then further grouped using either PROC ARBOR or PROC OPTBIN (the constrained optimal binning is described here: http://www2.sas.com/proceedings/forum2008/153-2008.pdf). Both procedures work on the bins themselves, not on the WOE for the bins.
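To make the pre-binning step concrete, here is a minimal Python sketch of the difference between bucket (equal-width) and quantile (equal-frequency) binning. It is only an illustration under my own assumptions, not the code the node runs: the function names `bucket_cuts`, `quantile_cuts`, and `assign_bins` are hypothetical, and the sample values are made up. The point is that the grouping step afterwards receives the bin index of each row, not a WOE value.

```python
from bisect import bisect_right

def bucket_cuts(values, n_bins):
    """Cut points for equal-width (bucket) bins between min and max."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + width * k for k in range(1, n_bins)]

def quantile_cuts(values, n_bins):
    """Cut points chosen so each bin holds roughly the same number of rows."""
    s = sorted(values)
    # A set drops duplicate cut points that heavy ties would otherwise create.
    return sorted({s[len(s) * k // n_bins] for k in range(1, n_bins)})

def assign_bins(values, cuts):
    """Map each value to the index of its pre-bin (0 .. len(cuts))."""
    return [bisect_right(cuts, v) for v in values]

# Made-up, right-skewed sample: most values are small, two are large.
values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 50, 100]

bucket_bins = assign_bins(values, bucket_cuts(values, 4))
quantile_bins = assign_bins(values, quantile_cuts(values, 4))

# Row counts per pre-bin under each scheme.
bucket_counts = [bucket_bins.count(b) for b in range(4)]
quantile_counts = [quantile_bins.count(b) for b in range(4)]
print(bucket_counts, quantile_counts)
```

On skewed data like this, bucket binning piles most rows into the first bin, while quantile binning spreads them evenly, which is why the choice of pre-binning method can change the groups that come out of the next step.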

 

Here is a paragraph from the Reference Help that might be useful:

 

After the interval variables have been pre-binned, a decision tree model is fitted for each characteristic. PROC ARBOR or PROC OPTBIN (if constrained optimal) is used to produce the groups. You can choose among four grouping methods: optimal criterion, quantile, monotonic event rate, and constrained optimal. The optimal criterion method uses one of two criteria: reduction in entropy measure or the p-value of the Pearson Chi-square statistic. The quantile method generates groups with approximately the same frequency in each group. The monotonic event rate method generates groups that result in a monotonic distribution of event rates across all attributes. The event rate is equal to P(event | attribute). This is the conditional probability of an event given that an applicant exhibits a particular attribute. The constrained optimal method finds an optimal set of groups and simultaneously imposes additional constraints, as specified in the node property panel settings.
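As a rough illustration of two of the grouping methods in that paragraph, the Python sketch below works on per-bin counts of the form (events, total). It is not PROC ARBOR or PROC OPTBIN; it is a simplified stand-in under my own assumptions. `best_entropy_split` finds the single split between ordered pre-bins that gives the largest reduction in entropy, and `monotonic_groups` merges adjacent pre-bins until the event rate P(event | attribute) = events / total never decreases (a pool-adjacent-violators style merge, which is my choice for the example). All names and counts are hypothetical, and every bin is assumed to have a non-zero total.

```python
import math

def entropy(events, total):
    """Binary entropy (in bits) of a group with `events` events out of `total`."""
    if total == 0 or events in (0, total):
        return 0.0
    p = events / total
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def best_entropy_split(bins):
    """Return (index, reduction) for the best single split of ordered pre-bins.

    Pre-bins before `index` form the left group, the rest the right group.
    """
    ev = sum(e for e, _ in bins)
    n = sum(t for _, t in bins)
    parent = entropy(ev, n)
    best = (None, 0.0)
    left_e = left_n = 0
    for i, (e, t) in enumerate(bins[:-1], start=1):
        left_e += e
        left_n += t
        right_e, right_n = ev - left_e, n - left_n
        # Weighted average entropy of the two child groups.
        child = (left_n * entropy(left_e, left_n)
                 + right_n * entropy(right_e, right_n)) / n
        if parent - child > best[1]:
            best = (i, parent - child)
    return best

def monotonic_groups(bins):
    """Merge adjacent pre-bins until event rates are non-decreasing."""
    groups = []
    for e, t in bins:
        groups.append([e, t])
        # Merge backwards while the previous group's rate exceeds the last one's.
        while (len(groups) > 1
               and groups[-2][0] / groups[-2][1] > groups[-1][0] / groups[-1][1]):
            e2, t2 = groups.pop()
            groups[-1][0] += e2
            groups[-1][1] += t2
    return [tuple(g) for g in groups]

# Made-up counts per ordered pre-bin: (events, total).
prebins = [(2, 100), (10, 100), (6, 100), (30, 100)]

split_index, gain = best_entropy_split(prebins)
merged = monotonic_groups(prebins)
print(split_index, round(gain, 4), merged)
```

In this example the second and third pre-bins break monotonicity (event rates of 0.10 then 0.06), so they are pooled into one group. A real tree would apply the entropy split recursively and respect the node's stopping settings, which this sketch leaves out.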
