munitech4u
Quartz | Level 8

Is there a way to tweak the GBM in SAS EM to implement the extreme gradient boosting algorithm?

 

Further, what is the best way to control overfitting in GBM using EM?

1 ACCEPTED SOLUTION

Accepted Solutions
JasonXin
SAS Employee
Hi,

With Viya, you can submit R or Python models to run in memory through API facilities, which means you may not have to sample your data down to test your GB or XGBoost models. Around Q3 of 2016, you should see the first batch of Viya ML released. That contains XGBoost. I don't think it has regularization.

A better way to curtail overfitting is simply to score the model on external data sets and focus on pruning. This is more analytical than fiddling with regularization. Regularization, in my personal opinion, is a 'lazy man's trick'. If you are worried about being replaced by machines in the future, then try to resort to regularization less. Regularization 'performance' on large simulated data should not lead you to believe that the performance will hold on wide, tall tables that contain true, complex (human) relationships. Hopefully, when you get on platforms like Viya, you will see for yourself.

Returning to EM:

1. EM does not have XGBoost.
2. I recommend not using the Open Source Integration node. If your true intention is to incorporate R GB models, score the R model and use the Model Import node. The node only requires that you provide the (same) target variable plus the R model score. The Model Comparison node will then include the R model in its performance grid.
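To make the "target plus score" idea concrete, here is a minimal Python sketch of comparing models on an external (holdout) data set when all you have is the target column and each model's score. It is not how the Model Import or Model Comparison nodes are implemented; the model names and the toy scores are hypothetical, and AUC is just one statistic a comparison grid might report.

```python
def auc(targets, scores):
    """Area under the ROC curve via pair counting.

    Equals the probability that a randomly chosen event (target == 1)
    gets a higher score than a randomly chosen non-event (ties count 0.5).
    """
    pos = [s for t, s in zip(targets, scores) if t == 1]
    neg = [s for t, s in zip(targets, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical holdout data: the same target for every model,
# plus one score column per model (made-up numbers for illustration).
holdout_target = [1, 0, 1, 1, 0, 0, 1, 0]
candidate_scores = {
    "em_gradient_boosting": [0.9, 0.2, 0.7, 0.35, 0.4, 0.1, 0.8, 0.3],
    "imported_r_model": [0.8, 0.3, 0.4, 0.9, 0.5, 0.2, 0.6, 0.7],
}

# Rank the candidates by holdout AUC, best first.
results = {name: auc(holdout_target, s) for name, s in candidate_scores.items()}
ranking = sorted(results, key=results.get, reverse=True)

for name in ranking:
    print(f"{name}: AUC = {results[name]:.4f}")
```

The point of the design is that the comparison never needs the model itself, only its scores on data it was not trained on, which is why an R model scored outside EM can sit in the same grid as native models.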


6 REPLIES 6
FriedEgg
SAS Employee

It could be done using the Open Source Integration Node. Extreme Gradient Boosting is not something available from SAS, currently. It is certainly something I hope they add in the near future though.

 

EDIT:

 

I feel like pointing out here that the main appeal of XGBoost is its performance from an engineering standpoint, rather than a statistical one. This is what really sets this package apart from the GBM from SAS.

 

 

As far as overfitting is concerned, two traditional methods would be reducing the number of iterations as well as adjusting your subsample size.
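As a rough illustration of those two knobs, here is a minimal pure-Python sketch of stochastic gradient boosting with one-split regression stumps. It is not the SAS EM or xgboost implementation: the function names, the synthetic data, and the default settings are all assumptions made for the example. Each round is fitted on a random `subsample` of the training rows, and the number of iterations to keep is chosen as the round with the lowest error on a separate validation set.

```python
import math
import random


def fit_stump(xs, residuals):
    """Least-squares regression stump on a single feature."""
    best = None
    for threshold in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        if not left or not right:
            continue
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, lmean, rmean)
    if best is None:  # no valid split: fall back to the mean residual
        mean = sum(residuals) / len(residuals)
        return (float("inf"), mean, mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, lmean, rmean = stump
    return lmean if x <= threshold else rmean


def mse(preds, ys):
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)


def boost(train, valid, max_rounds=200, learning_rate=0.1, subsample=0.5, seed=0):
    """Return (best number of rounds, validation MSE after each round)."""
    rng = random.Random(seed)
    xs, ys = zip(*train)
    vx, vy = zip(*valid)
    base = sum(ys) / len(ys)
    train_pred = [base] * len(xs)
    valid_pred = [base] * len(vx)
    k = min(len(xs), max(2, int(subsample * len(xs))))
    valid_curve = []
    for _ in range(max_rounds):
        idx = rng.sample(range(len(xs)), k)  # rows used in this round only
        stump = fit_stump([xs[i] for i in idx],
                          [ys[i] - train_pred[i] for i in idx])
        train_pred = [p + learning_rate * predict_stump(stump, x)
                      for p, x in zip(train_pred, xs)]
        valid_pred = [p + learning_rate * predict_stump(stump, x)
                      for p, x in zip(valid_pred, vx)]
        valid_curve.append(mse(valid_pred, vy))
    best_rounds = min(range(max_rounds), key=valid_curve.__getitem__) + 1
    return best_rounds, valid_curve


def make_data(n, rng):
    """Synthetic noisy sine data, for illustration only."""
    data = []
    for _ in range(n):
        x = rng.uniform(0, 6)
        data.append((x, math.sin(x) + rng.gauss(0, 0.3)))
    return data


rng = random.Random(42)
train, valid = make_data(80, rng), make_data(80, rng)
best_rounds, curve = boost(train, valid)
print("rounds to keep:", best_rounds)
print("validation MSE at best round:", round(curve[best_rounds - 1], 4))
print("validation MSE at last round:", round(curve[-1], 4))
```

Truncating the model at `best_rounds` is the "fewer iterations" lever, and lowering `subsample` is the other one; in a real tool you would set the equivalent options rather than hand-roll the loop.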

rogerjdeangelis
Barite | Level 11

 

There are packages in R to do this; I'm not sure about Python.

 

If you have IML, there is an interface to R; WPS also has an interface.

 

You might check xgboost.

 

I am out of my comfort zone on this reply

 

xgboost: eXtreme Gradient Boosting

 

However, I would switch to SAS when it is available, as long as SAS makes it part of stat. The R packages are the wild west of programming, and this is not like the Atkinson and Whittaker functions in previous posts. You can examine the R source code.

FriedEgg
SAS Employee

SAS Viya includes a distributed gradient boosting proc, which implements an algorithm very similar to xgboost.

munitech4u
Quartz | Level 8
Hmm... So SAS Viya seems to be the game changer. I was hoping that SAS would release something open source in the future.
RicardoGalante
Calcite | Level 5

Hello Experts,

 

Do you have an example of how to run R code with Extreme Gradient Boosting (xgboost) in SAS Enterprise Miner using the Open Source Integration node?

 

Thanks a lot!

 

 

 
