06-05-2017 01:04 PM - edited 06-05-2017 01:05 PM
I am using HPSPLIT on a highly imbalanced data set (only 3% of observations have the "event"). Missed events are considered extremely costly, so we are willing to give up specificity (accept more false positives) to gain sensitivity (fewer false negatives). I have tried balancing the data by undersampling non-events, but the model still misses too many events.
Is there a more direct way of modifying the model to reflect the "high cost" of missing events? Priors, cost weights, etc?
2 weeks ago - last edited 2 weeks ago
If you have SAS Enterprise Miner, you can incorporate decision weights into the target profile, and/or you can choose options in the Decision Tree node that assess the models on just a portion of the data (e.g., the top decile). HPSPLIT does not currently have that functionality, but a WEIGHT statement is planned for a future release that would let you specify a variable assigning more weight to the target observations you care about. Alternatively, you could try oversampling somewhat to produce a more balanced data set, which might yield a more useful model.
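To make the "decision weights" idea concrete outside of any particular SAS procedure, here is a minimal Python sketch of the usual arithmetic. It is not HPSPLIT or Enterprise Miner syntax. The function names, the 1:20 cost ratio, and the 50% event rate in the rebalanced sample are all assumptions chosen for illustration. It shows two things: mapping a probability estimated on a rebalanced sample back to the original 3% event rate, and picking a cutoff from the relative cost of a false positive versus a missed event, assuming correct decisions cost nothing.

```python
# Illustrative sketch only: not HPSPLIT or Enterprise Miner syntax.
# All names and numbers below are hypothetical.

def correct_for_sampling(p_sampled, true_prior, sample_prior):
    """Map an event probability estimated on a rebalanced sample
    back to the event rate of the original population."""
    event = p_sampled * true_prior / sample_prior
    non_event = (1.0 - p_sampled) * (1.0 - true_prior) / (1.0 - sample_prior)
    return event / (event + non_event)


def cost_threshold(cost_fp, cost_fn):
    """Probability cutoff that minimizes expected cost when
    correct decisions are assumed to cost nothing."""
    return cost_fp / (cost_fp + cost_fn)


def classify(p_event, cost_fp, cost_fn):
    """Flag an event whenever the expected cost of missing it
    is at least the expected cost of a false alarm."""
    return p_event >= cost_threshold(cost_fp, cost_fn)


# Assumed setup: 3% events in the population, 50% events after
# undersampling, and a missed event costing 20x a false positive.
leaf_probability = 0.70  # event rate in a tree leaf, on the sample
p_population = correct_for_sampling(leaf_probability, 0.03, 0.50)
flag_as_event = classify(p_population, cost_fp=1.0, cost_fn=20.0)
```

The point of the sketch is that the default 0.5 cutoff is rarely appropriate when misses are expensive: the higher the assumed cost of a false negative, the lower the cutoff, so more observations get flagged as events.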
Hope this helps!