JFlyers00
Calcite | Level 5

Hi,

I was wondering if there is a way to force the gradient boosting node to always produce positive predictions for an interval target. The target is never negative in the training data, and realistically it could never be negative.

 

I understand why the model predicts negative values, and I have tried doing a log transformation and exponentiating the predictions.  That did not work and the resulting model was poor.  I can elaborate more if needed.  
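For readers following along, the log-transform approach described above can be sketched as follows. This is only an illustration under stated assumptions, not the poster's actual workflow: the helper names and the `OFFSET` shift are hypothetical, and the shift is there only because `log(0)` is undefined when the target contains zeros.

```python
import math

OFFSET = 1.0  # hypothetical shift so the log is defined when the target is 0

def to_log_scale(y, offset=OFFSET):
    """Transform the target before training: z = log(y + offset)."""
    return [math.log(v + offset) for v in y]

def from_log_scale(pred_log, offset=OFFSET):
    """Back-transform predictions made on the log scale: exp(z) - offset."""
    return [math.exp(p) - offset for p in pred_log]

y = [0.0, 0.2, 3.0, 12.5]          # made-up non-negative targets
z = to_log_scale(y)                # what the model would be trained on
back = from_log_scale(z)           # round trip recovers the original values

# A strongly negative log-scale prediction still back-transforms to a
# value above -offset, but it can fall below 0 when offset > 0.
low = from_log_scale([-5.0])
```

Without an offset (a strictly positive target), `exp` of any prediction is positive. With an offset, the back-transformed prediction is only bounded below by `-offset`, so small negative values remain possible.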

 

Thank you,

James

2 REPLIES
PadraicGNeville
SAS Employee

In general, boosting produces predictions that are out of range because the model predicts those observations poorly (was that too obvious?).  Better predictions sometimes result from increasing the number of trees while decreasing the learning rate (the SHRINKAGE= parameter).  Other than that, no, there is no option within the boosting algorithm.  The predictions would have to be post-processed, perhaps simply by truncating negative predictions to 0.
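The truncation mentioned above can be sketched in a few lines. This is a minimal illustration in Python rather than SAS, with a hypothetical function name and made-up prediction values; in practice the same step would be applied to the scored data set.

```python
def truncate_negative(predictions, floor=0.0):
    """Post-process scored values: replace anything below `floor` with `floor`."""
    return [max(floor, p) for p in predictions]

raw = [-0.4, 0.0, 1.7, -0.05, 3.2]   # made-up model output
clipped = truncate_negative(raw)
print(clipped)  # [0.0, 0.0, 1.7, 0.0, 3.2]
```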

-Padraic

JFlyers00
Calcite | Level 5

I don't believe the model is predicting those observations poorly, since the observations that receive negative predictions usually have actual values of 0 or near 0.

The negative predictions seem to be the result of the gradient boosting node's underlying algorithm. Since the boosting node constructs an additive regression model by sequentially fitting a base learner to the current pseudo-residuals at each iteration, the final model is a sum of the base learners' outputs, and nothing constrains that sum to stay within the range of the target. This explains why the negatives are occurring despite there being no actual negative values.
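A toy example may make this concrete. The sketch below is not the SAS boosting node; it is a hand-rolled squared-error boosting loop over one-split stumps on two binary inputs, with made-up data and a hypothetical function name. With squared error the pseudo-residuals are the ordinary residuals, and the fitted sum of stumps drifts below zero for one observation even though every target value is non-negative.

```python
def boost_stumps(X, y, n_rounds=20, learning_rate=0.5):
    """Squared-error boosting with stumps on binary features (toy sketch)."""
    n = len(y)
    pred = [sum(y) / n] * n                      # start from the target mean
    for _ in range(n_rounds):
        resid = [yi - pi for yi, pi in zip(y, pred)]   # pseudo-residuals
        best = None
        for j in range(len(X[0])):               # try a stump on each feature
            leaf = {}
            for v in (0, 1):
                members = [resid[i] for i in range(n) if X[i][j] == v]
                leaf[v] = sum(members) / len(members)  # leaf value = mean residual
            sse = sum((resid[i] - leaf[X[i][j]]) ** 2 for i in range(n))
            if best is None or sse < best[0]:
                best = (sse, j, leaf)
        _, j, leaf = best
        # additive update: shrink the stump's output and add it to the model
        pred = [pred[i] + learning_rate * leaf[X[i][j]] for i in range(n)]
    return pred

# Target is positive only when both inputs are 1; it is never negative.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0.0, 0.0, 0.0, 4.0]
pred = boost_stumps(X, y)
print(pred)
```

In this toy case the prediction for the `(0, 0)` row settles near -1: the best additive fit subtracts one "main effect" for each input, and the sum overshoots below zero. Deeper trees can capture the interaction and reduce the overshoot, but the additive structure itself does not prevent it.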

I believe that performing a log transformation prior to modeling (as I tried) does not help, because the final model is still built from the pseudo-residuals of the base-learner trees. I was looking for further explanation of this, and for whether another transformation might force positive predictions (I don't think so).

As a last resort I could truncate negatives to 0, and that seems like it may be the only option.

Thank you.
