Quick guide to implementing advanced ensemble methods in SAS Enterprise Miner

by SAS Super FREQ on 05-25-2016 01:41 PM

 

Download the Files (GitHub)

 

Last month at SAS Global Forum 2016, I presented the paper Ensemble Modeling: Recent Advances and Applications, which I wrote with my colleagues yeliu and M_Maldonado. In the paper, we shared a SAS Enterprise Miner subflow that you can incorporate into your predictive modeling flow to implement the following ensemble methods, all of which take model performance into account: top-t, hill-climbing, clustering-based selection, and stacking.
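
To give a feel for two of these methods outside of SAS Enterprise Miner, here is a minimal Python sketch of top-t and hill-climbing selection. It is not the implementation from the paper or the subflow. It assumes each model's validation predictions are available as a list of probabilities, uses mean squared error as the performance measure, and combines models by simple averaging. The model names and data are hypothetical. Clustering-based selection and stacking are not shown.

```python
from statistics import mean


def average_predictions(prediction_sets):
    """Average several models' predicted probabilities, row by row."""
    return [mean(row) for row in zip(*prediction_sets)]


def mse(predictions, targets):
    """Mean squared error against the 0/1 target; lower is better."""
    return mean((p - y) ** 2 for p, y in zip(predictions, targets))


def top_t(model_preds, targets, t):
    """Return the names of the t models with the lowest validation error."""
    ranked = sorted(model_preds, key=lambda name: mse(model_preds[name], targets))
    return ranked[:t]


def hill_climb(model_preds, targets, max_rounds=10):
    """Greedy selection with replacement.

    Each round adds the model whose inclusion lowers the error of the
    averaged ensemble the most; selection stops when no addition helps.
    """
    chosen = []
    best_error = float("inf")
    for _ in range(max_rounds):
        best_name = None
        for name in model_preds:
            trial = [model_preds[m] for m in chosen + [name]]
            error = mse(average_predictions(trial), targets)
            if error < best_error:
                best_error, best_name = error, name
        if best_name is None:
            break
        chosen.append(best_name)
    return chosen, best_error


# Hypothetical validation data: three models scored on four observations.
targets = [1, 0, 1, 0]
validation_preds = {
    "tree": [0.9, 0.2, 0.6, 0.4],
    "logit": [0.7, 0.1, 0.8, 0.3],
    "neural": [0.4, 0.6, 0.5, 0.5],
}

print(top_t(validation_preds, targets, t=2))
print(hill_climb(validation_preds, targets))
```

Both functions select models on validation performance. Top-t ranks the models individually, while hill-climbing judges each candidate by how much it improves the ensemble already built.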

 

To make this subflow even easier to use, we are putting two XML files representing ensemble flows on our GitHub site, which also has other templates to help you get started with various data mining topics (see this tip for more information):

 

Ensemble Full Flow (predictive modeling portion)

EnsembleFullFlow.xml contains an entire predictive modeling and ensemble flow: the “Common Practices” flow from our paper (shown above) connected to the ensemble subflow (shown below), so you can see and run the whole process.

 

 

Ensemble Subflow

EnsembleSubflow.xml contains only the ensemble portion of the flow, which you can connect to an existing predictive modeling flow. After importing this XML file into your project, copy the entire flow into the diagram that holds your predictive modeling flow, connect the two flows, and run.

 

See the README file for instructions on how to import these XML files and quickly get started with these more sophisticated ensemble methods.

 

Note that several nodes in SAS Enterprise Miner create ensemble models directly, and they have been covered in previous SAS Global Forum papers:

 

  • The Ensemble node for simple averaging/voting/maximum of multiple models
  • The Start Group and End Group nodes for bagging and boosting
  • The Gradient Boosting node and HP Forest node for tree-based ensemble methods

See Leveraging Ensemble Models in SAS Enterprise Miner and The Power of the Group Processing Facility in SAS Enterprise Miner for more information.
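
For readers new to the simple combinations mentioned in the list above, here is a small Python sketch of averaging, maximum, and voting applied to one observation's predicted event probabilities. It illustrates the general ideas only and does not reproduce the Ensemble node's exact computations. The function name, the 0.5 cutoff, and the majority rule are assumptions made for illustration.

```python
from statistics import mean


def combine(probabilities, method="average", cutoff=0.5):
    """Combine several models' event probabilities for one observation.

    Hypothetical helper: 'average' and 'maximum' return a probability,
    while 'voting' returns 1.0 if a strict majority of models predict the
    event (probability >= cutoff) and 0.0 otherwise.
    """
    if method == "average":
        return mean(probabilities)
    if method == "maximum":
        return max(probabilities)
    if method == "voting":
        votes = sum(p >= cutoff for p in probabilities)
        return 1.0 if votes * 2 > len(probabilities) else 0.0
    raise ValueError(f"unknown method: {method}")


# Hypothetical probabilities from three models for a single observation.
scores = [0.2, 0.6, 0.7]
for method in ("average", "maximum", "voting"):
    print(method, combine(scores, method))
```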
