I don't believe the model is predicting the values poorly per se; the observations' actual values are usually 0 or near 0, and the negative predictions appear to be an artifact of the gradient boosting node's underlying algorithm. Since the boosting node constructs an additive regression model by sequentially fitting a base learner to the current pseudo-residuals at each iteration, the final model is an unbounded linear combination of those base learners. This explains why negative predictions occur even though there are no actual negative values. I believe a log transformation prior to modeling (which I tried) doesn't help, since the final model is still built from the pseudo-residuals of the base-learner trees and its output remains unconstrained. I was hoping for further explanation of this, and whether another transformation could force positive predictions (I don't think one exists). The last resort would be to truncate negative predictions to 0, but that seems like it may be the only option. Thank you.
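For what it's worth, here is a minimal sketch of the same effect outside the boosting node, using scikit-learn's GradientBoostingRegressor on a synthetic nonnegative target (all names and data here are illustrative, not from my actual project). It shows how the unbounded additive ensemble can dip below 0 where the target is mostly 0, along with the truncation fix:

```python
# Minimal sketch (scikit-learn, not the SAS gradient boosting node):
# an additive boosted ensemble can predict below 0 even when the
# training target is nonnegative, because the sum of base-learner
# outputs is unconstrained.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(500, 3))
# Target is mostly 0 or near 0, never negative
y = np.maximum(0, X[:, 0] - 8 + rng.normal(0, 0.5, size=500))

model = GradientBoostingRegressor(n_estimators=200, learning_rate=0.1)
model.fit(X, y)

pred = model.predict(X)
print("min prediction:", pred.min())  # can fall below 0 near the zero-mass region

# Last-resort fix: truncate negative predictions to 0
pred_truncated = np.clip(pred, 0, None)
```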