
Ensemble Models and Partitioning Algorithms in SAS® Enterprise Miner - Ask the Expert Q&A

by SAS Employee MelodieRush on ‎06-07-2017 03:35 PM (1,412 Views)

Did you miss the Ask the Expert session on Ensemble Models and Partitioning Algorithms in SAS® Enterprise Miner? Not to worry, you can catch it on-demand at your leisure.
Watch the webinar
The session covers:


  • An introduction to ensemble models and why they can be a valuable tool for predictive modeling
  • A review of decision trees, revealing a feature that makes partitioning algorithms such effective candidates for ensemble techniques
  • Definitions of bagging and boosting
  • A discussion of the advantages and disadvantages of the following ensemble methods available in SAS Enterprise Miner:
      ○ Gradient Boosting
      ○ Random Forests
      ○ Stacked Ensembles


Here are some highlighted questions from the Q&A segment held at the end of the session for ease of reference.

Q: Can I use all model nodes with the Ensemble Node?

A: In SAS Enterprise Miner 14.2, the Ensemble node supports only the modeling nodes that generate score code in DATA step format. It therefore does not support the Memory-Based Reasoning, HP Forest, or HP Text Miner nodes.

Q: What if I have an interval target variable, can I use the Ensemble Node with it?

A: Yes, the Ensemble node works with either an interval target or a categorical target variable.

Q: Is there a maximum number of models that can be ensembled?

A: No, there is no maximum. You must have one or more model nodes preceding the Ensemble node.

Q: How does the voting combination method work for an interval target?

A: The voting method is available only for categorical target variables, so it does not apply to an interval target. When you use the voting method, the posterior probabilities can be computed in one of two ways: Average or Proportion.
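
To make the two voting options concrete, here is a minimal Python sketch for a single observation. It is not SAS code and not the Ensemble node's implementation. It assumes that Average means averaging the posterior probabilities of only those models that predict the most popular class, and that Proportion means the share of models voting for each class; check the Ensemble node documentation for your release before relying on either definition. The function name `vote` and the example probabilities are made up for illustration.

```python
from collections import Counter

def vote(model_posteriors, method="average"):
    """Combine per-model posterior probabilities for one observation by voting.

    model_posteriors: list of dicts mapping class label -> posterior probability,
    one dict per model. Definitions of the two methods are assumptions (see text).
    """
    # Each model votes for its highest-probability class.
    votes = [max(p, key=p.get) for p in model_posteriors]
    counts = Counter(votes)
    # Most popular class; ties go to the class that was voted for first.
    winner = counts.most_common(1)[0][0]
    classes = list(model_posteriors[0])

    if method == "proportion":
        # Posterior = share of models that voted for each class.
        return {c: counts.get(c, 0) / len(votes) for c in classes}
    if method == "average":
        # Average the posteriors of only the models that voted for the winner.
        agreeing = [p for p, v in zip(model_posteriors, votes) if v == winner]
        return {c: sum(p[c] for p in agreeing) / len(agreeing) for c in classes}
    raise ValueError("method must be 'average' or 'proportion'")

# Hypothetical posteriors from three models for one observation.
posteriors = [
    {"yes": 0.9, "no": 0.1},
    {"yes": 0.6, "no": 0.4},
    {"yes": 0.3, "no": 0.7},
]
average_result = vote(posteriors, "average")
proportion_result = vote(posteriors, "proportion")
```

In this example two of the three models vote "yes", so under these assumptions Average gives roughly 0.75 for "yes" (the mean of 0.9 and 0.6) and Proportion gives roughly 0.67.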

Q: When you reach the End Groups node, are the bootstrap samples already combined and averaged?

A: Yes. The End Groups node will function as a model node and present the final aggregated model.
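
For readers who want to see what "combined and averaged" means for bagging, the following Python sketch draws bootstrap samples, fits one small model per sample, and averages the predictions. It is an illustration only, not how the Start Groups and End Groups nodes are implemented. The one-split regression stump, the function names, and the toy data are all assumptions chosen to keep the example short.

```python
import random

def fit_stump(rows):
    """Fit a one-split regression stump on (x, y) pairs; return a predict function."""
    overall = sum(y for _, y in rows) / len(rows)
    xs = sorted({x for x, _ in rows})
    best = None
    # Try a cut halfway between each pair of adjacent distinct x values.
    for lo, hi in zip(xs, xs[1:]):
        cut = (lo + hi) / 2
        left = [y for x, y in rows if x <= cut]
        right = [y for x, y in rows if x > cut]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, cut, left_mean, right_mean)
    if best is None:
        # Every x in the sample is identical, so no split is possible.
        return lambda x: overall
    _, cut, left_mean, right_mean = best
    return lambda x: left_mean if x <= cut else right_mean

def bagged_predict(rows, x, n_samples=25, seed=0):
    """Average the predictions of stumps fit to bootstrap samples of rows."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_samples):
        # Bootstrap sample: draw len(rows) rows with replacement.
        sample = [rng.choice(rows) for _ in rows]
        models.append(fit_stump(sample))
    return sum(model(x) for model in models) / len(models)

# Toy data with a step between x = 5 and x = 6.
rows = [(x, 0.0) for x in range(1, 6)] + [(x, 10.0) for x in range(6, 11)]
low = bagged_predict(rows, 2)
high = bagged_predict(rows, 9)
```

The final averaged predictor plays the role of the single aggregated model that downstream nodes see.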

Q: For Stacked Ensembles, do you first run all 4 models independently to pick the best model from each then merge?

A: Yes. You then merge the predictions from the 4 models and fit a new model that uses those predictions as its inputs.
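
The idea of modeling on the predictions can be sketched in a few lines of Python. To stay short, this example stacks only two base models rather than four, and the second-level model is a one-parameter linear blend fit by least squares; both simplifications are assumptions for illustration, not what the webinar flow uses. In practice the second-level model should be fit on predictions for data the base models were not trained on.

```python
def fit_blend_weight(pred_a, pred_b, target):
    """Least-squares weight w for the meta-model w * pred_a + (1 - w) * pred_b.

    Minimizing sum((y - b) - w * (a - b))**2 over w gives
    w = sum((y - b) * (a - b)) / sum((a - b)**2).
    """
    num = sum((y - b) * (a - b) for a, b, y in zip(pred_a, pred_b, target))
    den = sum((a - b) ** 2 for a, b in zip(pred_a, pred_b))
    if den == 0:
        # The base models agree everywhere, so any weight gives the same blend.
        return 0.5
    return num / den

def stacked_predict(weight, a, b):
    """Second-level prediction from the two base-model predictions."""
    return weight * a + (1 - weight) * b

# Hypothetical base-model predictions: one runs low, the other runs high.
pred_a = [1.0, 2.0, 3.0, 4.0]
pred_b = [3.0, 4.0, 5.0, 6.0]
target = [2.0, 3.0, 4.0, 5.0]

weight = fit_blend_weight(pred_a, pred_b, target)
blended = [stacked_predict(weight, a, b) for a, b in zip(pred_a, pred_b)]
```

Here the fitted weight is 0.5, because the target sits exactly halfway between the two base predictions. A richer second-level model, such as a regression or a tree with all four predictions as inputs, follows the same pattern.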

Q: How do we know which ensemble approach (average/stacking/cluster-based) we should use in a given situation?

A: The great news is that with SAS Enterprise Miner you can try all of them and see which one works best for your data in your situation.

Q: What are your suggestions about avoiding overfitting?

A: The best way to avoid overfitting is to use a holdout sample to validate the model on data that was not used for training.
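
As a rough illustration of why a holdout sample exposes overfitting, the Python sketch below partitions toy data into training and validation sets and compares the error of a deliberately overfit model on each. The function names, the 70/30 split, and the "memorizer" model are assumptions made up for this example; in Enterprise Miner the partitioning itself would typically be done with the Data Partition node.

```python
import random

def partition(rows, validate_fraction=0.3, seed=42):
    """Shuffle rows and split them into (train, validate) lists."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_validate = int(round(len(shuffled) * validate_fraction))
    return shuffled[n_validate:], shuffled[:n_validate]

def mean_squared_error(predict, rows):
    """Average squared difference between predictions and targets."""
    return sum((predict(x) - y) ** 2 for x, y in rows) / len(rows)

def fit_memorizer(train):
    """A deliberately overfit model: memorize training targets,
    and fall back to the training mean for unseen inputs."""
    lookup = dict(train)
    fallback = sum(y for _, y in train) / len(train)
    return lambda x: lookup.get(x, fallback)

# Toy data: a noisy linear relationship.
rng = random.Random(0)
data = [(x, 2.0 * x + rng.gauss(0, 1)) for x in range(40)]

train, validate = partition(data)
model = fit_memorizer(train)
train_error = mean_squared_error(model, train)        # zero: the model memorized these rows
validate_error = mean_squared_error(model, validate)  # much larger on unseen rows
```

A large gap between training error and validation error, as in this sketch, is the signal that the model has fit noise rather than structure.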

Q: I realize this webinar is about Enterprise Miner, but can we do similar things in Enterprise Guide, and which one has greater market presence?

A: With Enterprise Guide you could write programs to accomplish some of the same ensemble techniques, but it would be fairly complex. Gradient Boosting, Random Forest, and Neural Networks are not available in SAS/STAT, so they would not be available in Enterprise Guide unless you have licensed Enterprise Miner (or Viya products that include these algorithms) and use the procedures available in Enterprise Miner (or Viya).

Want more tips? Be sure to subscribe to the Data Mining Library to receive follow-up Q&A, slides, and other related resources from the webinar. From the Data Mining Library, just click Subscribe in the orange bar underneath the list of recent articles.
