JTho
Calcite | Level 5

I am using HPSPLIT on a highly imbalanced data set (only 3% of observations had the "event"). Missed events are considered extremely costly, so we are willing to give up specificity (accept more false positives) in exchange for sensitivity (fewer false negatives). I have tried balancing the data by undersampling non-events, but the model still misses too many events.

 

Is there a more direct way to modify the model so that it reflects the high cost of missing events, such as priors or cost weights?

 

Thank you.

 

1 REPLY 1
DougWielenga
SAS Employee

If you have SAS Enterprise Miner, you can incorporate decision weights into the target profile, and/or you can choose options in the Decision Tree node that assess the models on just a portion of the data (e.g., the top decile). HPSPLIT does not currently have that functionality, but a WEIGHT statement is planned for a future release; it would let you specify a variable that assigns more weight to the target observations you care about. Alternatively, you could try oversampling the events somewhat to build a more balanced data set, which might produce a more useful model.


Hope this helps!

Doug
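
To make the decision-weight and resampling ideas above concrete, here is a minimal sketch in Python. It is not HPSPLIT or Enterprise Miner syntax. It assumes you already have a predicted event probability for each observation from a tree trained on resampled data. The function names, the 50/50 resampled event rate, and the 20-to-1 cost ratio are all hypothetical, chosen only for illustration; the 3% event rate comes from the question. The first function gives the standard cost-minimizing cutoff when correct decisions cost nothing, and the second maps a probability estimated on resampled data back to the original event rate.

```python
def cost_threshold(cost_fn, cost_fp):
    """Cutoff that minimizes expected cost when correct decisions cost nothing.

    Flag an observation as an event when its event probability is at
    least cost_fp / (cost_fp + cost_fn).
    """
    return cost_fp / (cost_fp + cost_fn)


def correct_for_sampling(p_sample, prior_true, prior_sample):
    """Map a probability estimated on resampled data back to the original event rate.

    prior_true   -- event rate in the original data (3% in the question)
    prior_sample -- event rate in the resampled training data (assumed)
    """
    num = p_sample * prior_true / prior_sample
    den = num + (1.0 - p_sample) * (1.0 - prior_true) / (1.0 - prior_sample)
    return num / den


def classify(p_sample, prior_true, prior_sample, cost_fn, cost_fp):
    """Return True when flagging the observation as an event has the lower expected cost."""
    p = correct_for_sampling(p_sample, prior_true, prior_sample)
    return p >= cost_threshold(cost_fn, cost_fp)


if __name__ == "__main__":
    # Hypothetical costs: a missed event is 20 times as costly as a false alarm.
    cutoff = cost_threshold(cost_fn=20, cost_fp=1)
    print(f"cutoff on the corrected probability: {cutoff:.4f}")

    # Hypothetical scores from a tree trained on a 50/50 resampled data set.
    for score in (0.5, 0.7, 0.9):
        flagged = classify(score, prior_true=0.03, prior_sample=0.5,
                           cost_fn=20, cost_fp=1)
        print(f"score {score:.1f} -> flag as event: {flagged}")
```

The point of the sketch is that the cost of a missed event can be expressed through the cutoff rather than through the tree itself: a higher miss cost lowers the cutoff, which raises sensitivity and lowers specificity. Whether this recovers enough events depends on how well the tree ranks observations in the first place.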

 

  
