When doing a logistic regression, you can compute the Gains, Lift, %response, but also the ROC Curve and associated statistics of Gini and AUC. Do you have any advice on when to use each of these statistics to evaluate the quality of the model fit?
(Please note that what follows is my personal opinion based on the modeling scenarios I have encountered; opinions might vary greatly among modelers based on their own experience and objectives.)
There are several things to consider when fitting a model, and how you want to use the model is critical to informing how you should select the model itself.
* Do you have specific objectives in how the model will be used? For example, if my goal is to make a business decision on certain observations (e.g. choose a specific strategy for each observation/account), I might be more concerned about how effective my strategies are on select portions of the population than about how the model fits overall. If I am dealing with a rare event, I am typically most interested in how the model performs on the (likely) small percent of the population on which I take action. Lift/Gain are calculated at specific percentages of the population, allowing me to evaluate strategy effectiveness at any given depth, while Gini and AUC (area under the curve) assess overall model performance across the entire population. In some cases, however, interpretation is critical, which raises another question.
* Do you need the model to be interpretable? If you need to interpret your fitted model, you will be forced to choose from a subset of modeling approaches. You should still consider more flexible modeling strategies, however, since the performance of these other models can give you an idea of how much performance you are sacrificing for interpretability. Your choice in this case is more challenging since decision trees are often considered when interpretability is desired, but trees don't lend themselves to smoothly changing metrics. Trees have a relatively small number of distinct predicted values from their terminal nodes, and every observation in the same node has the same predicted value. Some modelers apply secondary models to better sort the observations within a node to overcome this. If you have a regression model, however, you can choose any distinct predicted value in the data as your cutoff value, and there are typically not large blocks of observations with exactly the same score among the training observations. Suppose the highest response rates occurred in terminal tree nodes containing 12.3%, 6.4% and 4.8% of the data. Lift charts are computed by evaluating bins (e.g. every 5% of the population), so these terminal nodes do not fit nicely into the bins: a bin boundary falls inside a node, where all observations share the same predicted value and cannot be ranked against each other.
* Do you have profit/cost information that should be considered? You can specify decision weights for a categorical target variable that allow you to evaluate the most profitable outcome. In this case, many of the metrics you mentioned might be less critical than the projected revenue/profit from making certain decisions, even if those decisions are more likely than not to be wrong. Suppose you had data with a 1% fraud rate. A person 10 times as likely as average to be fraudulent still has only a 10% chance of being fraudulent, which leaves a 90% chance of not being fraudulent. If the cost of fraud is in the tens of thousands and the cost of investigation is relatively small, you might consider investigating this person who is 90% likely to be non-fraudulent. These types of considerations are not captured by Lift/Gain or Gini/AUC.
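To make the difference between depth-based metrics (Lift/Gain) and whole-population metrics (AUC/Gini) concrete, here is a minimal Python sketch. The function names and the toy data are my own illustration and are not taken from any particular software; it assumes a binary target coded 0/1 and uses the standard relationship Gini = 2 * AUC - 1.

```python
def gain_and_lift(y_true, scores, depth):
    """Cumulative gain and lift for the top `depth` fraction of the population.

    Observations are ranked by score, highest first. Tied scores (as from
    tree terminal nodes) keep their input order, so results at a depth that
    cuts through a block of ties are arbitrary.
    """
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, round(depth * len(ranked)))
    events_top = sum(y for _, y in ranked[:n_top])
    total_events = sum(y_true)
    gain = events_top / total_events                      # share of all events captured
    lift = (events_top / n_top) / (total_events / len(ranked))
    return gain, lift


def auc(y_true, scores):
    """AUC as the probability that a random event outscores a random
    non-event, with ties counted as one half."""
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def gini(y_true, scores):
    """Gini coefficient derived from AUC."""
    return 2 * auc(y_true, scores) - 1


# Hypothetical toy data: 10 observations, 2 events.
y = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
s = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]

gain_20, lift_20 = gain_and_lift(y, s, depth=0.20)
print(gain_20, lift_20)      # metrics at one chosen depth
print(auc(y, s), gini(y, s)) # metrics over the whole population
```

Note that `gain_and_lift` answers a question about one depth only, while `auc` and `gini` summarize every possible cutoff at once, which is why the two families of metrics can rank competing models differently.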
It is best to think about how these different metrics can be balanced against the business objectives. Are you more concerned about paying out huge fraud costs, or are you more concerned about angering your non-fraudulent clientele and potentially losing business? As you can see, choosing the best criterion can be difficult at times.
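The fraud trade-off can be sketched as a simple expected-cost comparison. The dollar amounts below are purely hypothetical (a 20,000 fraud loss and a 500 investigation cost), and the sketch makes the simplifying assumptions that an investigation always prevents the loss and that there is no additional cost from upsetting a legitimate customer.

```python
def should_investigate(p_fraud, fraud_loss, investigation_cost):
    """Investigate when the expected loss from ignoring the case
    exceeds the cost of investigating it."""
    expected_loss_if_ignored = p_fraud * fraud_loss
    return expected_loss_if_ignored > investigation_cost


def break_even_probability(fraud_loss, investigation_cost):
    """Fraud probability at which investigating and ignoring cost the same."""
    return investigation_cost / fraud_loss


# Hypothetical costs, chosen only for illustration.
FRAUD_LOSS = 20_000
INVESTIGATION_COST = 500

# Someone with a 10% fraud probability (10x a 1% base rate).
print(should_investigate(0.10, FRAUD_LOSS, INVESTIGATION_COST))
# Someone at the 1% base rate.
print(should_investigate(0.01, FRAUD_LOSS, INVESTIGATION_COST))
print(break_even_probability(FRAUD_LOSS, INVESTIGATION_COST))
```

Under these assumed costs the break-even probability is far below 50%, so it can pay to act on someone who is most likely not fraudulent. Adding a cost for annoying legitimate customers would raise that threshold.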
Hope this helps!
Cordially,
Doug