JTho
Calcite | Level 5

I am using HPSPLIT with a highly imbalanced data set (only 3% of observations had the "event"). Missed events are extremely costly, so we are willing to give up specificity (accept more false positives) to gain sensitivity (fewer false negatives). I have tried balancing the data by undersampling non-events, but we are still missing too many events.

 

Is there a more direct way of modifying the model to reflect the "high cost" of missing events? Priors, cost weights, etc?

 

Thank you.

 

DougWielenga
SAS Employee

If you have SAS Enterprise Miner, you can incorporate decision weights into the target profile, and/or you can choose options in the Decision Tree node that allow the models to be assessed on just a portion of the data (e.g. the top decile). HPSPLIT does not currently have that functionality, but a WEIGHT statement is planned for a future release; it would let you specify a variable that assigns more weight to the desired target observations. Alternatively, you could try oversampling the events somewhat to produce a more balanced data set, which might yield a more useful model.
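Until a weighting option is available, one workaround is to apply the costs after the tree is fit rather than inside it. The sketch below is in Python, not SAS, and is only an illustration: it assumes you can score observations to get a predicted event probability, that the model was fit on a resampled data set, and that correct decisions cost nothing. The function names, the 50% sample event rate, and the 1:20 cost ratio are all hypothetical. It first maps the predicted probability back to the original 3% event rate, then flags an event whenever that probability reaches the cost-based cutoff cost_fp / (cost_fp + cost_fn).

```python
def prior_corrected(p_sampled, true_rate, sample_rate):
    """Map an event probability from a model fit on resampled data
    back to the event rate of the original population."""
    num = p_sampled * true_rate / sample_rate
    den = num + (1.0 - p_sampled) * (1.0 - true_rate) / (1.0 - sample_rate)
    return num / den


def cost_cutoff(cost_fp, cost_fn):
    """Probability cutoff that minimizes expected cost,
    assuming correct decisions cost nothing."""
    return cost_fp / (cost_fp + cost_fn)


def flag_event(p_sampled, true_rate, sample_rate, cost_fp, cost_fn):
    """Flag an event when the corrected probability reaches the cutoff."""
    p_true = prior_corrected(p_sampled, true_rate, sample_rate)
    return p_true >= cost_cutoff(cost_fp, cost_fn)


# Illustrative values only: 3% true event rate, model fit on a 50/50 sample,
# and a missed event assumed to cost 20 times as much as a false alarm.
for p in (0.5, 0.7):
    print(p, flag_event(p, true_rate=0.03, sample_rate=0.5, cost_fp=1, cost_fn=20))
```

With a 1:20 cost ratio the cutoff is 1/21, far below the usual 0.5, so many more observations are flagged as events. The cost ratio is a business input, so it is worth trying several values against a validation set.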


Hope this helps!

Doug

 

  
