JTho
Calcite | Level 5

I am using HPSPLIT on a highly imbalanced data set (only 3% of observations had the "event"). Missing an event is considered extremely costly, so we are willing to give up specificity (accept more false positives) to gain sensitivity (fewer false negatives). I have tried balancing the data by undersampling non-events, but we are still missing too many events.

 

Is there a more direct way of modifying the model to reflect the "high cost" of missing events? Priors, cost weights, etc?

 

Thank you.

 

DougWielenga
SAS Employee

If you have SAS Enterprise Miner, you can incorporate decision weights into the target profile, and/or you can choose options in the Decision Tree node that allow the models to be assessed on just a portion of the data (e.g., the top decile). HPSPLIT does not currently have that functionality, but a WEIGHT statement is planned for a future release that would let you specify a variable that assigns more weight to the desired target observations. Alternatively, you could try oversampling somewhat to generate a more balanced data set, which might produce a more useful model.


Hope this helps!

Doug
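
To illustrate the decision-weight idea from the reply above, here is a minimal Python sketch of what such weights amount to once a tree has been scored. It is not an HPSPLIT or Enterprise Miner feature; it assumes you have exported the predicted event probability for each observation and post-process it yourself. The function names and the 20:1 cost ratio are hypothetical, and the 50% training event rate is an assumed rebalancing; only the 3% event rate comes from the question.

```python
def cost_based_cutoff(cost_fn: float, cost_fp: float) -> float:
    """Probability cutoff that minimizes expected misclassification cost.

    Flagging an observation is cheaper in expectation when
    p * cost_fn > (1 - p) * cost_fp, i.e. p > cost_fp / (cost_fp + cost_fn).
    """
    if cost_fn <= 0 or cost_fp <= 0:
        raise ValueError("costs must be positive")
    return cost_fp / (cost_fp + cost_fn)


def adjust_for_oversampling(p_sample: float, prior_true: float,
                            prior_sample: float) -> float:
    """Rescale a probability from a rebalanced sample to the true event rate.

    prior_true is the event rate in the population (0.03 in the question);
    prior_sample is the event rate in the rebalanced training data.
    """
    event = p_sample * prior_true / prior_sample
    nonevent = (1.0 - p_sample) * (1.0 - prior_true) / (1.0 - prior_sample)
    return event / (event + nonevent)


def flag_event(p_event: float, cost_fn: float, cost_fp: float) -> bool:
    """True when the expected cost of missing the event exceeds a false alarm."""
    return p_event >= cost_based_cutoff(cost_fn, cost_fp)


if __name__ == "__main__":
    # Hypothetical costs: a missed event is 20 times worse than a false alarm.
    cutoff = cost_based_cutoff(cost_fn=20.0, cost_fp=1.0)
    # Assumed setup: the tree was trained on data rebalanced to 50% events.
    p_population = adjust_for_oversampling(0.80, prior_true=0.03, prior_sample=0.50)
    print(cutoff, p_population, flag_event(p_population, 20.0, 1.0))
```

With a 20:1 cost ratio the cutoff falls to 1/21 (about 0.048) rather than the default 0.5, which is how the "high cost" of a missed event translates into more observations being flagged. If the tree was trained on rebalanced data, the probabilities should be adjusted back to the 3% population rate before the cutoff is applied; otherwise the sampling and the cost ratio both push toward flagging.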
