Gradient Boosting is performing worse than random - Help please

Occasional Contributor
Posts: 6

Hello SASers,

I am working on a project with a binary target. The target distribution is 13.6% (event) vs 86.4% (non-event). The decision tree, regression and gradient boosting models are all scoring around a 19% misclassification rate on the validation data. I have two questions, but first here are some details of my process flow:

I tried using inverse priors with the models' assessment statistic set to Decision, but switched to Misclassification after I realized the models performed marginally better under that setting.

Data partition node is set to 70% (train) and 30% (validation).

I tried oversampling event cases to 33% of the data, but the misclassification rate rose to 20%.

First question: If I oversample, does the 20% misclassification rate take into account that I oversampled (i.e., the oversampled 20% misclassification is worse than the non-oversampled 19%)? Or is the oversampled 20% better than the non-oversampled 19%, because the event now makes up 33% of the observations and 20% is clearly an improvement relative to that?

Second question: Do y'all have any suggestions about what is causing the models to perform worse than random, and how I might fix the problem?

Thank y'all so much for your time.

Best,

RWB

Occasional Contributor
Posts: 6

Re: Gradient Boosting is performing worse than random - Help please

Posted in reply to Analyze_this

Oops, I made a rookie mistake. I calculated the distribution from the histograms produced by the Explore Variable process and forgot to change my settings from (Top, Default) to (Random, Max). In actuality, the target distribution is around 30% (event) vs 70% (non-event). So the models are adding to our predictive power.
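For anyone following along, the arithmetic behind "worse than random" can be sketched in a few lines. This is plain Python for illustration, not Enterprise Miner code, and the function name is made up: a model that always predicts the majority class misclassifies exactly the minority share of the data.

```python
def majority_baseline_error(event_rate):
    """Misclassification rate of always predicting the majority class.

    Hypothetical helper for illustration only; event_rate is the
    proportion of event cases in the data (between 0 and 1).
    """
    return min(event_rate, 1.0 - event_rate)


model_error = 0.19  # validation misclassification rate reported in the thread

# With the mistaken 13.6% event rate, the baseline appears to beat the models.
print(majority_baseline_error(0.136) < model_error)  # baseline looks better

# With the corrected 30% event rate, the models beat the baseline.
print(majority_baseline_error(0.30) > model_error)  # models look better
```

So 19% only looked worse than the trivial baseline because the event rate was misread as 13.6%.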

I'm still curious about the first question I asked above.  I'll restate it:

First question: If I oversample, does the 20% misclassification rate take into account that I oversampled (i.e., the oversampled 20% misclassification is worse than the non-oversampled 19%)? Or is the oversampled 20% better than the non-oversampled 19%, because the event now makes up 33% of the observations and 20% is clearly an improvement relative to that?


If y'all could help me solve this one, that would be great.


Thank you.



SAS Super FREQ
Posts: 306

Re: Gradient Boosting is performing worse than random - Help please

Posted in reply to Analyze_this

No, oversampling is not being accounted for unless you adjust your prior probabilities and/or decision matrix, either in the Input Data node or a Decisions node after you have sampled.  The "Detecting Rare Classes" section under Analytics > Predictive Modeling in the Enterprise Miner Reference Help provides the best practices for handling rare events.
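As a rough illustration of what adjusting for prior probabilities means, here is a minimal Python sketch of the standard posterior correction for an oversampled training set. It is not Enterprise Miner code and is not claimed to reproduce what the Decisions node does internally; the function and argument names are assumptions made for the example. Each class's predicted probability is reweighted by the ratio of its true prior to its proportion in the sample, then renormalized.

```python
def adjust_posterior(p_sample, sample_event_rate, true_event_rate):
    """Rescale an event probability from an oversampled model to the true priors.

    p_sample          -- event probability predicted on the oversampled scale
    sample_event_rate -- event proportion in the oversampled training data
    true_event_rate   -- event proportion in the population (the prior)
    All names here are hypothetical and chosen for illustration.
    """
    w_event = true_event_rate / sample_event_rate
    w_nonevent = (1.0 - true_event_rate) / (1.0 - sample_event_rate)
    numerator = p_sample * w_event
    return numerator / (numerator + (1.0 - p_sample) * w_nonevent)


# A rare event at 5% in the population, oversampled to 30% for training:
# a score of 0.5 on the oversampled scale shrinks once the true prior is restored.
print(adjust_posterior(0.5, sample_event_rate=0.30, true_event_rate=0.05))
```

Without a correction of this kind, classifications and error rates reflect the oversampled mix of events and non-events rather than the population.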

Hope that helps,

Wendy Czika

SAS Enterprise Miner R&D

Occasional Contributor
Posts: 6

Re: Gradient Boosting is performing worse than random - Help please

Posted in reply to WendyCzika

Thank you, Wendy. I'm using inverse priors in the decision matrix, so would the misclassification rate of, let's say, a decision tree take into account that the data is oversampled? Here's the situation driving my question: when I deal with rare events (the event happens in 5% of the data), I'll sometimes get a misclassification rate of, let's say, 15% on the validation data. I then try oversampling (with inverse priors, of course), increasing the event proportion from 5% to 10%, 20%, 30%, etc., and I end up with misclassification rates higher than the original 15%. Is there a way to compare results across different oversampling proportions? SAS's training material usually suggests oversampling for rare events, but I've been getting worse results when I do this.

SAS Super FREQ
Posts: 306

Re: Gradient Boosting is performing worse than random - Help please

Posted in reply to Analyze_this

I'm unclear about what you are doing exactly when you say oversampling with inverse priors.  If you are using the Sample node to sample a higher proportion of rare events, then you would need a Decisions node following it to adjust the prior probabilities.  When using the same prior probabilities, it is valid to compare the models with different event proportions from oversampling.  The "Prior Probabilities" section of the same part of the EM Reference Help that I mentioned above explains this better than I can!
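One way to picture comparing models under the same prior probabilities is to weight each class's error rate by the population prior instead of by the class mix in the sampled data. The Python sketch below is only an illustration of that idea with hypothetical names and made-up counts; it is not taken from the EM Reference Help and may differ from how Enterprise Miner computes its assessment statistics.

```python
def prior_weighted_error(fn, tp, fp, tn, true_event_rate):
    """Misclassification rate re-expressed on the population priors.

    fn, tp -- missed and correctly predicted events in the validation data
    fp, tn -- falsely flagged and correctly predicted non-events
    true_event_rate -- event proportion in the population (the prior)
    All names and the example counts below are hypothetical.
    """
    miss_rate = fn / (fn + tp)          # error rate among actual events
    false_alarm_rate = fp / (fp + tn)   # error rate among actual non-events
    return true_event_rate * miss_rate + (1.0 - true_event_rate) * false_alarm_rate


# Two models validated on data with different event proportions,
# both scored against the same 5% population prior.
print(prior_weighted_error(fn=40, tp=10, fp=20, tn=930, true_event_rate=0.05))
print(prior_weighted_error(fn=90, tp=210, fp=105, tn=595, true_event_rate=0.05))
```

Because both numbers are expressed on the same 5% prior, they can be compared directly even though the two validation sets contain different shares of events.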
