BookmarkSubscribeRSS Feed
JTho
Calcite | Level 5

I am using HPSPLIT and working with very highly imbalanced database (3% had "event"). In this case, events are considered extremely costly so we are willing to trade off specificity (false positives) for sensitivity (false negatives). I have tried balancing the data (undersample non-events), but we are still missing too many events.

 

Is there a more direct way of modifying the model to reflect the "high cost" of missing events? Priors, cost weights, etc?

 

Thank you.

 

1 REPLY 1
DougWielenga
SAS Employee

If you have SAS Enterprise Miner, you can incorporate decision weights into the target profile and/or you can choose options in the Decision Tree node that will allow the models to be assessed on just a portion of the data (e.g. the top decile).   HPSPLIT does not currently have that functionality but a WEIGHT statement is planned for a future release that would allow you to specify a variable that assigns more weight to the desired target observations.   Alternatively, you could try and oversample somewhat to generate a data set with more balance that might generate a more useful model.   


Hope this helps!

Doug

 

  

sas-innovate-2026-white.png



April 27 – 30 | Gaylord Texan | Grapevine, Texas

Registration is open

Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!

Register now

How to choose a machine learning algorithm

Use this tutorial as a handy guide to weigh the pros and cons of these commonly used machine learning algorithms.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 1 reply
  • 1678 views
  • 0 likes
  • 2 in conversation