Has anyone used the output files from PROC TREEBOOST?
We are using PROC TREEBOOST to build models for our production environment, and I use its output files to build my production scoring file for implementation. One of the parameters in the NOD file is GAMMA. I notice that in the SAS score code that the procedure can output, the GAMMA from this dataset is sometimes divided by 10 and sometimes divided by 100.
I am trying to figure out whether the parameter that controls this is in one of the output files from PROC TREEBOOST:
MDL
NOD
RUL
Anyone have any insight? I appreciate your time.
The line in the PROC TREEBOOST score code should look something like,
_ARB_F_ + SHRINKAGE * GAMMA, where SHRINKAGE is the shrinkage parameter to PROC TREEBOOST and GAMMA is the GAMMA in the NODESTATS output dataset. Does that resolve it?
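To make the arithmetic concrete, here is a minimal Python sketch of that update. It assumes the score code accumulates `_ARB_F_` by adding SHRINKAGE * GAMMA once per tree. The function name `boosted_score` and the GAMMA values are hypothetical and not taken from any real NODESTATS dataset. Under that assumption, a shrinkage of 0.1 would look like GAMMA divided by 10, and a shrinkage of 0.01 would look like GAMMA divided by 100.

```python
def boosted_score(gammas, shrinkage, initial=0.0):
    """Accumulate arb_f = arb_f + shrinkage * gamma over the leaves a row lands in.

    gammas    : one GAMMA value per tree (hypothetical values here)
    shrinkage : the shrinkage parameter given to the boosting procedure
    initial   : starting value of the accumulator (assumed 0.0 for illustration)
    """
    arb_f = initial
    for gamma in gammas:
        arb_f += shrinkage * gamma
    return arb_f


# Hypothetical GAMMA values for the leaf reached in each of three trees.
leaf_gammas = [2.0, -1.0, 0.5]

# Same GAMMAs, two different shrinkage settings.
score_shrink_010 = boosted_score(leaf_gammas, shrinkage=0.1)   # each GAMMA scaled by 1/10
score_shrink_001 = boosted_score(leaf_gammas, shrinkage=0.01)  # each GAMMA scaled by 1/100
```

If this reading is right, the divisor seen in the score code would come from the SHRINKAGE setting of each run rather than from a column in the MDL, NOD, or RUL files.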
Hi Todd,
Are you talking about the score code or the optimized score code produced by the Gradient Boosting node?
Where do you see this gamma parameter?
I'm glad you found your solution, todd8325! Can you "Accept the correct reply as a solution"? Or if one was particularly helpful, feel free to "Like" it. This will help other community members who may run into the same issue know what worked.
Thanks!
Anna
Hi, I just started using PROC TREEBOOST and have a couple of questions:
1) I was not able to find how to output optimized score code using the SCORE statement. I was able to use the CODE statement to output score code.
2) Is there any way in PROC TREEBOOST to use an existing score as a strong learner and build the treeboost algorithm on top of it?
Any suggestions would be much appreciated. Thanks.
Hi. Optimized score code is produced by the Score node in Enterprise Miner.
Ray
Thanks, Ray.
Does anyone have experience using GBM with segmentation versus GBM without segmentation? My intuition says that segmentation will always give better results. Any suggestions?
There is no way to tell PROC TREEBOOST to incorporate other learners.
That said, if the target Y has an interval level of measurement, then use a pseudo-target (Y-P) as input to PROC TREEBOOST, where P is the prediction from another learner. The final prediction is P_treeboost + P, where P_treeboost is the prediction from PROC TREEBOOST.
Thanks a lot for the response. Unfortunately, my target is 1 or 0.
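The pseudo-target idea can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not PROC TREEBOOST itself: the helper names `pseudo_target` and `combine`, the data values, and the stand-in "model" (which just predicts the mean residual) are all assumptions made for the example.

```python
def pseudo_target(y, p):
    """Return the residuals Y - P, to be used as the new target."""
    return [yi - pi for yi, pi in zip(y, p)]


def combine(p_treeboost, p):
    """Final prediction: P_treeboost + P, row by row."""
    return [ti + pi for ti, pi in zip(p_treeboost, p)]


# Hypothetical interval target and predictions from an existing learner.
y = [10.0, 12.0, 9.0, 15.0]
p = [9.0, 12.5, 10.0, 13.0]

# Step 1: build the pseudo-target that the boosting model would be trained on.
residuals = pseudo_target(y, p)

# Step 2: stand-in for the boosting model (assumption): predict the mean residual
# for every row. In practice this would be the PROC TREEBOOST prediction.
mean_residual = sum(residuals) / len(residuals)
p_treeboost = [mean_residual] * len(y)

# Step 3: add the existing learner's prediction back to get the final score.
final_prediction = combine(p_treeboost, p)
```

The key point is that the boosting model only ever sees Y-P as its target, so the existing score has to be added back at scoring time.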