ajosh,
Modeling rare events (a situation that is actually quite common) is often challenging for several reasons:
* The null model is highly accurate (with a 2% response rate, a model that assigns every observation to the nonevent is 98% accurate)
* Failing to put any additional weight on correctly predicting the rare event can lead to a null model (for the reason above)
* Increasing the weight on correctly predicting the rare event results in flagging far more observations as events than actually have the event
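To make the first point concrete, here is a minimal Python sketch (not SAS code) using made-up data: 1,000 observations with a 2% event rate. The data and the `confusion_counts` helper are assumptions for illustration only.

```python
# Hypothetical data: 1,000 observations with a 2% event rate (20 events).
actual = [1] * 20 + [0] * 980


def confusion_counts(actual, predicted):
    """Return (TP, FP, TN, FN) for 0/1 labels."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == 1 and a == 1:
            tp += 1
        elif p == 1 and a == 0:
            fp += 1
        elif p == 0 and a == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


# Null model: assign every observation to the nonevent.
null_predicted = [0] * len(actual)
tp, fp, tn, fn = confusion_counts(actual, null_predicted)

null_accuracy = (tp + tn) / len(actual)  # share classified correctly
null_recall = tp / (tp + fn)             # share of true events found

print(f"accuracy={null_accuracy:.2f}, events found={null_recall:.0%}")
```

The accuracy looks excellent even though the model never identifies a single event, which is why accuracy alone says very little here.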
It might be helpful to separate the task of modeling an outcome from the task of taking action on that outcome. When modeling a rare event, you must often oversample the rare event, add weight to correctly predicting the rare event, choose a model selection criterion that is not based on the classification, or use some combination of these. For the reasons stated above, misclassification is typically not a good selection criterion for modeling. SAS Enterprise Miner always provides a classification based on which outcome is most likely. When a target profile is created and decision weights are employed, SAS Enterprise Miner also creates variables containing the most profitable outcome based on the target profile you created. The meaningfulness of that prediction is directly related to the applicability of the target profile weights.
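The difference between the most likely outcome and the most profitable decision can be sketched in a few lines of Python. This is only an illustration of the idea, not how SAS Enterprise Miner computes its output variables. The function names, the 20% sample rate, and the profit values are all hypothetical. The first function applies the standard prior correction for a model trained on an oversampled data set.

```python
def adjust_for_oversampling(p_sample, sample_rate, true_rate):
    """Rescale a probability from an oversampled model to the population prior."""
    event = p_sample * true_rate / sample_rate
    nonevent = (1.0 - p_sample) * (1.0 - true_rate) / (1.0 - sample_rate)
    return event / (event + nonevent)


# Hypothetical decision weights (profit per case), not from any real target profile.
PROFIT = {
    "investigate": {"event": 100.0, "nonevent": -5.0},
    "ignore": {"event": 0.0, "nonevent": 0.0},
}


def most_likely_outcome(p_event):
    """Default classification: whichever outcome has the higher probability."""
    return "event" if p_event >= 0.5 else "nonevent"


def most_profitable_decision(p_event, profit=PROFIT):
    """Pick the decision with the highest expected profit."""
    def expected_profit(decision):
        row = profit[decision]
        return p_event * row["event"] + (1.0 - p_event) * row["nonevent"]
    return max(profit, key=expected_profit)


# A score of 0.60 from a model trained with 20% events, when the true rate is 2%.
p = adjust_for_oversampling(0.60, sample_rate=0.20, true_rate=0.02)
print(round(p, 3), most_likely_outcome(p), most_profitable_decision(p))
```

With these assumed weights, a case can be classified as a nonevent by the most-likely rule and still be worth investigating, and that result is only as good as the weights themselves.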
In general, the modeling itself is more clear-cut in that each analyst can pick and choose their criteria for the 'best' model and then build it. The resulting probabilities can then be used to order the observations. Unfortunately, in decision tree models all of the observations in a single node are given the same score, which is why some people run additional models within each terminal node to further separate the observations. The choice of what to do with the ordered observations typically involves business decisioning. The choice to investigate fraud can be costly, particularly if the person investigated is an honest, loyal customer who just had an unusual situation. The amount of money at stake, the customer's longevity/profitability with the business, and the future expected value of the customer are just a few things that might be considered. This business decisioning usually creates far more complex criteria than can be reduced to a misclassification matrix, which does not take the amount of money at risk into account.
Simply put, whether you take the default decision based on the most likely outcome (typically inappropriate for a rare event), use the decision-weighted predicted outcome (assuming the decision profile accurately represents the business decisioning), or use some other strategy for selecting cases to investigate (based on available resources, amount at risk, likelihood of fraud, etc.), the TP and FP come from the strategy you employ. I clearly advocate business decisioning in determining how to proceed because the simple classification rate itself is not meaningful enough for rare events. Even looking at the expected value of money at risk (e.g. the product of the probability of fraud and the amount at risk) will yield a different ordering of observations than the probabilities alone. So there isn't a great answer to the question of which cutoff to use without fully understanding the business objectives and priorities. I tend to use some oversampling (but not to 50/50, because that under-represents the non-event) and decision weights with priors to allow variable selection and to get reasonable probabilities, but then combine those probabilities with other information to determine the final prioritization/action for observations based on some more complex rules.
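As a small illustration of how the ordering changes, here is a Python sketch with three made-up cases. The case ids, probabilities, amounts, and the `capacity` value are all hypothetical, and real prioritization rules would usually involve more than this single product.

```python
# Hypothetical cases: (case id, probability of fraud, amount at risk).
cases = [
    ("A", 0.30, 100.0),
    ("B", 0.10, 5000.0),
    ("C", 0.20, 200.0),
]


def expected_loss(case):
    """Probability of fraud times the amount at risk."""
    _, p_fraud, amount = case
    return p_fraud * amount


# Order the same cases two ways, highest priority first.
by_probability = [c[0] for c in sorted(cases, key=lambda c: c[1], reverse=True)]
by_expected_loss = [c[0] for c in sorted(cases, key=expected_loss, reverse=True)]

# Assumed investigation capacity: only the top two cases can be worked.
capacity = 2
to_investigate = by_expected_loss[:capacity]

print("by probability:  ", by_probability)
print("by expected loss:", by_expected_loss)
print("investigate:     ", to_investigate)
```

Case B has the lowest probability of fraud but the most money at risk, so it moves from last to first once the amount is taken into account. The TP and FP counts you end up with depend on which ordering and capacity you choose.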