todd8325
Calcite | Level 5

Has anyone used the output files from PROC TREEBOOST?

 

We are using PROC TREEBOOST to build models for our production environment. I use the output files from PROC TREEBOOST to build my production scoring file for implementation. One of the columns in the NOD file is GAMMA. I notice that when GAMMA from this dataset is used in the SAS score code that the procedure can output, it is sometimes divided by 10 and sometimes divided by 100.

 

I am trying to figure out whether the parameter that controls this is in one of the output files from PROC TREEBOOST:

 

MDL

NOD

RUL

 

Anyone have any insight?   I appreciate your time.


M_Maldonado
Barite | Level 11

Hi Todd,

Are you talking about the score code or optimized score code produced by Gradient Boosting node?

Where do you see this gamma parameter?

todd8325
Calcite | Level 5
Hi Miguel,

Thank you for your response. PROC TREEBOOST outputs several output files. In particular, I am looking at the NODESTATS output data set. In that dataset, there is a column called GAMMA. I can see how GAMMA is used in the optimized score code in the output from the SCORE statement.

However, it appears that in some models the value of GAMMA in the NODESTATS output dataset is divided by 10 in the score code, and in some models it is divided by 100. I am trying to determine if there is a parameter in any of the output files from PROC TREEBOOST that indicates the value that GAMMA is divided by.
PadraicGNeville
SAS Employee

The line in the PROC TREEBOOST score code should look something like,

_ARB_F_ + SHRINKAGE * GAMMA, where SHRINKAGE is the shrinkage parameter to PROC TREEBOOST and GAMMA is the GAMMA in the NODESTATS output dataset.  Does that resolve it?
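
To make the arithmetic concrete, here is a minimal Python sketch (not SAS score code) of the update described above. The function names and the GAMMA value are hypothetical and chosen only for illustration. Read this way, the "divided by 10" and "divided by 100" behavior would simply correspond to SHRINKAGE values of 0.1 and 0.01.

```python
def apply_tree_update(f_prev, gamma, shrinkage):
    """Add one tree's contribution: F_new = F_prev + shrinkage * gamma."""
    return f_prev + shrinkage * gamma


def implied_shrinkage(gamma, score_code_increment):
    """Recover the multiplier that turns a NODESTATS GAMMA into the
    increment seen in the score code (assumes gamma is nonzero)."""
    return score_code_increment / gamma


gamma = 0.42  # hypothetical leaf GAMMA value, not taken from a real model

for shrinkage in (0.1, 0.01):
    increment = apply_tree_update(0.0, gamma, shrinkage)
    # With shrinkage 0.1 the increment is gamma / 10; with 0.01 it is gamma / 100.
    print(shrinkage, increment, implied_shrinkage(gamma, increment))
```

If the ratio between a score-code increment and the matching NODESTATS GAMMA is constant across leaves of a model, that ratio is a reasonable candidate for the SHRINKAGE the model was trained with.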

todd8325
Calcite | Level 5
Yes it does, thank you!!
AnnaBrown
Community Manager

I'm glad you found your solution, todd8325! Can you "Accept the correct reply as a solution"? Or if one was particularly helpful, feel free to "Like" it. This will help other community members who may run into the same issue know what worked.

Thanks!
Anna



asnster
Calcite | Level 5

Hi, I just started using PROC TREEBOOST and have a couple of questions:

 

1) I was not able to find out how to output optimized score code using the SCORE statement. I was able to use the CODE statement to output score code.

2) Is there any way in PROC TREEBOOST to use an existing score as a strong learner and build the treeboost algorithm on top of it?

 

Any suggestions would be much appreciated. Thanks.

rayIII
SAS Employee

Hi. Optimized score code is produced by the Score node in Enterprise Miner. 

 

https://communities.sas.com/t5/SAS-Communities-Library/Scoring-Series-Part-3-Enterprise-Miner-Optimi...

 

Ray

asnster
Calcite | Level 5

Thanks, Ray.

 

Does anyone have experience using GBM with segmentation versus GBM without segmentation? My intuition says that segmentation will always give better results. Any suggestions?

PadraicGNeville
SAS Employee

There is no way to tell PROC TREEBOOST to incorporate other learners. 

 

That said, if the target Y has an interval level of measurement, you can use a pseudo-target (Y - P) as the target given to PROC TREEBOOST, where P is the prediction from another learner. The final prediction is then P_treeboost + P, where P_treeboost is the prediction from PROC TREEBOOST.
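
The residual idea above can be sketched in plain Python. This is only an illustration under stated assumptions: the data are made up, the existing learner P is a hypothetical score equal to x, and a one-split regression stump stands in for PROC TREEBOOST (it is not how TREEBOOST fits its trees).

```python
def fit_stump(x, r):
    """Fit a one-split least-squares regression stump to residuals r.

    Assumes x contains at least two distinct values.
    """
    best = None
    for threshold in sorted(set(x))[:-1]:  # keep the right side non-empty
        left = [ri for xi, ri in zip(x, r) if xi <= threshold]
        right = [ri for xi, ri in zip(x, r) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((ri - left_mean) ** 2 for ri in left)
               + sum((ri - right_mean) ** 2 for ri in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda xi: left_mean if xi <= threshold else right_mean


# Made-up interval target and a hypothetical existing score P (here P = x).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 1.9, 3.1, 6.0, 7.1, 7.9]
p = list(x)

# Step 1: build the pseudo-target Y - P.
residual = [yi - pi for yi, pi in zip(y, p)]

# Step 2: train the second model on the pseudo-target (stump as stand-in).
stump = fit_stump(x, residual)

# Step 3: final prediction = second model's prediction + P.
final = [stump(xi) + pi for xi, pi in zip(x, p)]


def sse(pred):
    """Sum of squared errors against the original target y."""
    return sum((yi - fi) ** 2 for yi, fi in zip(y, pred))


print(sse(p), sse(final))
```

The second model only has to explain what the existing score misses, which is why its prediction is added back to P at scoring time rather than used on its own.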

 

 

asnster
Calcite | Level 5

Thanks a lot for the response. Unfortunately, my target is 1 or 0.
