Proc Treeboost

Occasional Contributor
Posts: 5

Has anyone used the output files from Proc TreeBoost?

We are using Proc TreeBoost to build models for our production environment. I use the output files from Proc TreeBoost to build my production scoring file for implementation. One of the parameters in the NOD file is GAMMA. I notice that the GAMMA in this dataset is sometimes divided by 10 and sometimes divided by 100 when it is used in the SAS score code that the proc can output.

I am trying to figure out whether the parameter that controls this is in one of the output files from Proc TreeBoost:

MDL

NOD

RUL

Does anyone have any insight? I appreciate your time.


All Replies
Super Contributor
Posts: 336

Re: Proc Treeboost

Hi Todd,

Are you talking about the score code or the optimized score code produced by the Gradient Boosting node?

Where do you see this gamma parameter?

Occasional Contributor
Posts: 5

Re: Proc Treeboost

Hi Miguel, thank you for your response. Proc TreeBoost produces several output files. In particular, I am looking at the NODESTATS output data set. In that dataset, there is a column called GAMMA. I can see how GAMMA is used in the optimized score code in the output from the SCORE statement. However, it appears that in some models the value of GAMMA in the NODESTATS output dataset is divided by 10 in the score code, and in other models it is divided by 100. I am trying to determine whether there is a parameter in any of the output files from Proc TreeBoost that indicates the value that GAMMA is divided by.
Solution
‎07-07-2017 02:44 PM
SAS Employee
Posts: 32

Re: Proc Treeboost

The line in the PROC TREEBOOST score code should look something like:

_ARB_F_ + SHRINKAGE * GAMMA

where SHRINKAGE is the shrinkage parameter to PROC TREEBOOST and GAMMA is the GAMMA in the NODESTATS output dataset. Does that resolve it?
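
A small numeric sketch of that update, written in Python rather than SAS, may make the factor of 10 versus 100 concrete. It assumes the two models in question were trained with shrinkage values of 0.1 and 0.01; those values, the GAMMA value, and the function name are hypothetical and chosen only for illustration.

```python
# Illustration only: the numbers below are made up, not read from a real
# NODESTATS dataset, and boosted_update is a hypothetical helper name.

def boosted_update(arb_f, gamma, shrinkage):
    """Return the running score after adding one leaf's contribution."""
    return arb_f + shrinkage * gamma

gamma = 2.5   # leaf value as it would appear in the GAMMA column
arb_f = 0.0   # running score (_ARB_F_) before this tree is applied

# Assumed shrinkage of 0.1: the contribution equals GAMMA / 10.
step_tenth = boosted_update(arb_f, gamma, shrinkage=0.1)

# Assumed shrinkage of 0.01: the contribution equals GAMMA / 100.
step_hundredth = boosted_update(arb_f, gamma, shrinkage=0.01)

print(step_tenth, step_hundredth)
```

Under that assumption, the divisor seen in the score code is simply 1 / SHRINKAGE, so it would be set by the shrinkage option used when the model was trained rather than by anything stored alongside GAMMA.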

Occasional Contributor
Posts: 5

Re: Proc Treeboost

Yes it does, thank you!!
Community Manager
Posts: 484

Re: Proc Treeboost

I'm glad you found your solution, todd8325! Can you "Accept the correct reply as a solution"? Or if one was particularly helpful, feel free to "Like" it. This will help other community members who may run into the same issue know what worked.

Thanks!
Anna

New Contributor
Posts: 4

Re: Proc Treeboost

Hi, I just started using PROC TREEBOOST and have a couple of questions:

1) I could not find how to output optimized score code using the SCORE statement. I was able to output score code with the CODE statement.

2) Is there any way in PROC TREEBOOST to use an existing score as a strong learner and build the treeboost algorithm on top of it?

Any suggestions are much appreciated. Thanks.

SAS Employee
Posts: 106

Re: Proc Treeboost

Hi. Optimized score code is produced by the Score node in Enterprise Miner. 

 

https://communities.sas.com/t5/SAS-Communities-Library/Scoring-Series-Part-3-Enterprise-Miner-Optimi...

 

Ray

New Contributor
Posts: 4

Re: Proc Treeboost

Thanks, Ray.

Does anyone have experience using GBM with segmentation versus GBM without segmentation? My intuition says that segmentation will always give better results. Any suggestions?

SAS Employee
Posts: 32

Re: Proc Treeboost

There is no way to tell PROC TREEBOOST to incorporate other learners.

That said, if the target Y has an interval level of measurement, you can use a pseudo-target (Y - P) as the target for PROC TREEBOOST, where P is the prediction from another learner. The final prediction is then P_treeboost + P, where P_treeboost is the prediction from PROC TREEBOOST.
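
The pseudo-target idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are made up, the existing learner is a hypothetical rule P = x, and `fit_stump` is a one-split regression stump standing in for PROC TREEBOOST, not a reproduction of its algorithm.

```python
# Sketch of fitting a second model on the pseudo-target Y - P.
# fit_stump is a hypothetical stand-in for PROC TREEBOOST.

def fit_stump(xs, ys):
    """Fit a one-split regression stump that minimizes squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # keep both sides non-empty
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean

def mse(actual, predicted):
    """Mean squared error between two equal-length lists."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

# Made-up interval target Y and a single input x.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 1.9, 3.2, 6.0, 7.1, 7.9]

# P: prediction from the existing learner (assumed here to be P = x).
p = [xi for xi in x]

# Pseudo-target Y - P, used as the target for the second model.
pseudo_target = [yi - pi for yi, pi in zip(y, p)]
booster = fit_stump(x, pseudo_target)

# Final prediction: P_treeboost + P.
p_treeboost = [booster(xi) for xi in x]
final = [pt + pi for pt, pi in zip(p_treeboost, p)]

print(mse(y, p), mse(y, final))
```

To score a new row, both pieces are needed: the existing learner's prediction P for that row plus the second model's prediction for the same row.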

 

 

New Contributor
Posts: 4

Re: Proc Treeboost

Thanks a lot for the response. Unfortunately, my target is 1 or 0.

☑ This topic is SOLVED.
