Gokirop
Fluorite | Level 6

Hello, 

Could anybody help me find the documentation for the Interactive Grouping node in SAS Enterprise Miner? I am interested in the details, such as precisely how the algorithm works. As far as I know, the continuous variable is first binned, after which a decision tree is applied. In particular, which decision tree algorithm is applied? Also, is the decision tree run on the already transformed WOE values, or on the original variable?

Finally, this might be a long shot, but is it possible to combine this method with other variables to capture interactions?

 

Best,

Marin

Accepted Solutions
WendyCzika
SAS Employee

The EM Reference Help for this node (under SAS Credit Scoring) provides a good amount of detail. You are correct that the continuous (interval) inputs are first binned via bucket or quantile binning. Those bins are then further grouped using either PROC ARBOR or PROC OPTBIN (the constrained optimal binning is described here: http://www2.sas.com/proceedings/forum2008/153-2008.pdf). Both procedures work on the bins themselves, not on the WOE values for the bins.
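To make the pre-binning step concrete, here is a minimal Python sketch of quantile pre-binning followed by a per-bin tally of events and non-events. It is an illustration only, not the code the node runs: the function names (`quantile_cutpoints`, `assign_bin`, `prebin_counts`) and the way ties are handled are assumptions made for this example. The point it shows is that the later grouping step sees ordered bins and their counts, not WOE values.

```python
from bisect import bisect_right


def quantile_cutpoints(values, n_bins):
    """Return interior cut points taken at empirical quantiles.

    Hypothetical helper: duplicate cut points are dropped, so heavily
    tied data can yield fewer than n_bins bins.
    """
    ordered = sorted(values)
    n = len(ordered)
    cuts = []
    for k in range(1, n_bins):
        cut = ordered[(k * n) // n_bins]
        if not cuts or cut > cuts[-1]:
            cuts.append(cut)
    return cuts


def assign_bin(value, cuts):
    """Map a raw value to the index of its pre-bin (0 = lowest bin)."""
    return bisect_right(cuts, value)


def prebin_counts(values, events, n_bins):
    """Pre-bin an interval input and count [events, non-events] per bin."""
    cuts = quantile_cutpoints(values, n_bins)
    counts = [[0, 0] for _ in range(len(cuts) + 1)]
    for value, event in zip(values, events):
        bin_index = assign_bin(value, cuts)
        counts[bin_index][0 if event else 1] += 1
    return cuts, counts


# Toy data (made up for illustration): 12 applicants, event = 1 for the top 3.
values = list(range(1, 13))
events = [1 if v >= 10 else 0 for v in values]
cuts, counts = prebin_counts(values, events, n_bins=4)
```

The grouping step would then take `counts`, ordered by bin index, as its input.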

 

Here is a paragraph from the Reference Help that might be useful:

 

After the interval variables have been pre-binned, a decision tree model is fitted for each characteristic. PROC ARBOR or PROC OPTBIN (if constrained optimal) is used to produce the groups. You can choose among four grouping methods: optimal criterion, quantile, monotonic event rate, and constrained optimal. The optimal criterion method uses one of two criteria: reduction in entropy measure or the p-value of the Pearson Chi-square statistic. The quantile method generates groups with approximately the same frequency in each group. The monotonic event rate method generates groups that result in a monotonic distribution of event rates across all attributes. The event rate is equal to P(event | attribute). This is the conditional probability of an event given that an applicant exhibits a particular attribute. The constrained optimal method finds an optimal set of groups and simultaneously imposes additional constraints, as specified in the node property panel settings.
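As a rough illustration of the quantities mentioned in that paragraph, the Python sketch below computes the event rate P(event | attribute) for each bin, checks whether the rates are monotonic, and finds the single boundary between adjacent bins that gives the largest reduction in entropy. This is a simplified stand-in and not what PROC ARBOR or PROC OPTBIN actually does: the function names and the toy counts are hypothetical, only one binary split is searched, and every bin is assumed to be non-empty.

```python
from math import log2


def entropy(events, nonevents):
    """Binary entropy (in bits) of a group with the given counts."""
    total = events + nonevents
    h = 0.0
    for count in (events, nonevents):
        if count:
            p = count / total
            h -= p * log2(p)
    return h


def event_rates(bins):
    """P(event | attribute) for each bin; bins are (events, nonevents)."""
    return [e / (e + n) for e, n in bins]


def is_monotonic(rates):
    """True if the rates never decrease or never increase across bins."""
    pairs = list(zip(rates, rates[1:]))
    return all(a <= b for a, b in pairs) or all(a >= b for a, b in pairs)


def best_split(bins):
    """Return (boundary, gain) for the best single split of ordered bins.

    boundary = i means bins[:i] form one group and bins[i:] the other.
    Returns (None, 0.0) if no split reduces entropy.
    """
    total_e = sum(e for e, _ in bins)
    total_n = sum(n for _, n in bins)
    total = total_e + total_n
    parent = entropy(total_e, total_n)
    best = (None, 0.0)
    left_e = left_n = 0
    for i in range(len(bins) - 1):
        left_e += bins[i][0]
        left_n += bins[i][1]
        right_e, right_n = total_e - left_e, total_n - left_n
        left_weight = (left_e + left_n) / total
        children = (left_weight * entropy(left_e, left_n)
                    + (1 - left_weight) * entropy(right_e, right_n))
        gain = parent - children
        if gain > best[1]:
            best = (i + 1, gain)
    return best


# Toy (events, nonevents) counts for four ordered pre-bins, made up here.
bins = [(1, 9), (2, 8), (6, 4), (8, 2)]
rates = event_rates(bins)
boundary, gain = best_split(bins)
```

With these toy counts the event rate rises from bin to bin, and the best single split separates the two low-rate bins from the two high-rate bins. A tree-based grouping would repeat this kind of search within each resulting group until a stopping rule is met.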

