Random Forests in Enterprise Miner

Contributor
Posts: 46

Random Forests in Enterprise Miner

Is it possible to mimic random forests via the settings in the Decision Tree node? I've read the SAS help and Applied Analytics Using SAS Enterprise Miner, and done a Google search, but I'm not getting very far.


(I think I can do k-fold cross validation using the cross validation settings for the decision tree, but I'm not sure I'm doing it correctly.)

Any suggestions or references? I feel like I've got a good basic feel for Enterprise Miner and a decent theoretical background in various machine learning techniques: I've read a lot of The Elements of Statistical Learning: Data Mining, Inference, and Prediction (http://www-stat.stanford.edu/~tibs/ElemStatLearn/) and watched Andrew Ng's machine learning lectures (http://www.youtube.com/view_play_list?p=A89DCFA6ADACE599).

I need some more advanced references using Enterprise Miner.

Thanks.

Accepted Solutions
Solution
‎07-25-2016 02:20 PM
Community Manager
Posts: 2,761

Re: Random Forests in Enterprise Miner

Since the time of this original post (over 5 years!), SAS Enterprise Miner has added deep support for random forests, including an HP Forest node. 

 

See Getting the most from your Random Forests in SAS Enterprise Miner.  Also, watch this YouTube video about Random Forest and Support Vector Machines.

 

You might also want to read this paper about ensemble models in SAS Enterprise Miner.  From the abstract: 

  • Ensemble models combine two or more models to enable a more robust prediction, classification, or variable selection. This paper describes three types of ensemble models: boosting, bagging, and model averaging. It discusses go-to methods, such as gradient boosting and random forest, and newer methods, such as rotational forest and fuzzy clustering.
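Of the three ensemble types named in the abstract, model averaging is the easiest to illustrate. The following is only a minimal sketch in Python, not Enterprise Miner code: the function name `average_posteriors` and the three probability lists are made-up placeholders standing in for the scored output of three separately trained models.

```python
def average_posteriors(model_outputs):
    """Average per-row event probabilities across models.

    model_outputs is a list of equal-length lists, one list per model.
    """
    n_models = len(model_outputs)
    return [sum(probs) / n_models for probs in zip(*model_outputs)]


# Hypothetical event probabilities for four rows from three models
# (illustrative numbers only, not results from any real run).
tree_p = [0.90, 0.20, 0.60, 0.40]
regression_p = [0.80, 0.30, 0.50, 0.10]
neural_p = [0.70, 0.10, 0.70, 0.40]

ensemble_p = average_posteriors([tree_p, regression_p, neural_p])

# Classify each row with a 0.5 cutoff on the averaged probability.
ensemble_class = [int(p >= 0.5) for p in ensemble_p]
```

Bagging follows the same averaging idea, except that the models being combined are the same type of model fit to different bootstrap samples of the training data.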


Brett Wujek talks about tuning random forest and support vector machine algorithms to train high quality models.

All Replies
Occasional Contributor
Posts: 7

Random Forests in Enterprise Miner

Does the new version of Miner perform random forests?

http://support.sas.com/documentation/cdl/en/whatsnew/64209/HTML/default/viewer.htm#emwhatsnew71m1.ht...

"New procedures cover data binning, imputation, sampling, decisions, logistic and linear regressions, neural networks, random forests"

Contributor
Posts: 46

Re: Random Forests in Enterprise Miner

It has been a while since I inquired about this, but I found that gradient boosting was very useful! Thanks.

Occasional Contributor
Posts: 17

Re: Random Forests in Enterprise Miner

In EM 7.1, try PROC FOREST, which builds random forests in SAS EM. Unfortunately, SAS hasn't released the syntax or detailed documentation. If you have EM 7.1, you need to use the code-generation function to peek at the secrets.

One advantage of random forests is that they are very easy for the user to parallelize. I can build a random forest of 2000 small trees by launching 4 sessions simultaneously, each building 500 small trees.
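The reason this works is that each tree depends only on its own bootstrap sample and random feature choice, so batches built separately can simply be concatenated. Here is a rough sketch of the idea in plain Python, not SAS: the toy data, the one-split "stumps" standing in for small trees, and the names `fit_stump`, `build_batch`, and `predict` are all assumptions for illustration, and the batch size is scaled down from 500 to 25 to keep it quick.

```python
import random
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def fit_stump(rows, labels, feature):
    """Fit a one-split tree on one feature, choosing the threshold with the fewest errors."""
    best = None
    for threshold in sorted({r[feature] for r in rows}):
        left = [y for r, y in zip(rows, labels) if r[feature] <= threshold]
        right = [y for r, y in zip(rows, labels) if r[feature] > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
        if best is None or errors < best[0]:
            best = (errors, feature, threshold, left_label, right_label)
    return best[1:]  # (feature, threshold, left_label, right_label)


def build_batch(rows, labels, n_trees, seed):
    """Build one batch of trees; each batch has its own seed, like a separate session."""
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample of the rows
        feature = rng.randrange(n_features)         # random feature for this tree
        trees.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feature))
    return trees


def predict(forest, row):
    """Majority vote over all trees in the merged forest."""
    votes = Counter(left if row[f] <= t else right for f, t, left, right in forest)
    return votes.most_common(1)[0][0]


# Toy data (assumed): a 10 x 10 grid, labeled 1 when the two features sum to more than 9.
rows = [(i, j) for i in range(10) for j in range(10)]
labels = [int(i + j > 9) for i, j in rows]

# Four workers stand in for four simultaneous sessions, each with a different seed.
seeds = [1, 2, 3, 4]
with ThreadPoolExecutor(max_workers=4) as pool:
    batches = list(pool.map(lambda s: build_batch(rows, labels, 25, s), seeds))

# Merging is just concatenation, because the trees never interact during training.
forest = [tree for batch in batches for tree in batch]
```

Threads are used here only to keep the sketch self-contained; in standard CPython they will not speed up CPU-bound training, so separate processes or sessions would be needed for a real speedup. The important detail is that each batch gets a different seed, otherwise the batches would contain identical trees.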

SAS Employee
Posts: 122

Re: Random Forests in Enterprise Miner

In the major release SAS had in August 2012, EM gained a random forest node. The latest version of EM is 12.2, with an HPFOREST node that essentially runs PROC HPFOREST from its High Performance Analytics offerings.

I posted several usage examples of the HPFOREST node and PROC HPFOREST on my blog, Analytics in Writing.

Random forest modeling typically requires a lot of memory. In the world of large-scale predictive learning, some people invest in building in-memory models and modes of modeling, while others invest in 'smart finesses' such as MapReduce. In in-memory applications, for example, 1.5 TB of RAM distributed across parallel worker nodes is often not considered large for building a random forest.

I once saw SAS programmers writing Base SAS code to build random forests. Over 10 years ago, I first saw Salford Systems' offering, which typically ran on smaller data sets. Complexity naturally goes along with big data sets. This is where random forest is supposed to 'shine', but learning algorithms from papers is one thing; industrializing them at large scale is an entirely different game. I have used the SAS HPFOREST capabilities for a while. I believe they are still generation one, but they have crossed the critical threshold into industrialization.

Community Manager
Posts: 486

Re: Random Forests in Enterprise Miner

I just came across this animated video on HPFOREST showing an example of how it may work in the academic space. While not getting into detail, it's a quick and artful watch.

Animating Analytics: PROC HPFOREST - YouTube

Anna

N/A
Posts: 1

Re: Random Forests in Enterprise Miner

Hi Jason,

I found the articles on your website really helpful. Do you have any documentation relating to PROC HPFOREST which you can email me?

SAS Super FREQ
Posts: 272

Re: Random Forests in Enterprise Miner

Please contact Tech Support (Technical Support Form) to get access to the secure HP procedure documentation that is available from the link:

http://support.sas.com/documentation/onlinedoc/miner/

Contributor
Posts: 36

Re: Random Forests in Enterprise Miner

Hi David,

Are there any references on gradient boosting? Thanks.

SAS Employee
Posts: 122

Re: Random Forests in Enterprise Miner

If you have access to the EM product or its documentation, the gradient boosting details are under the Gradient Boosting node. The procedure behind the node is PROC TREEBOOST. I think the details of all the procedures behind EM are now public on the SAS support site, although the official policy remains 'as is', meaning they are not supported by SAS Technical Support. I know the high-performance version of gradient boosting is under construction, but I have no information on when it will be ready.

Jason Xin
