## Building models with SAS Enterprise Miner, SAS Factory Miner, SAS Visual Data Mining and Machine Learning or just with programming

Frequent Contributor
Posts: 95

Hi All,

I have a quick question. I have built several models, and I use AUC and MAPE to assess them (MAPE is calculated as shown below). The AUC is not good, but the MAPE looks OK. How is that possible? My thresholds and the Gini, AUC and MAPE results are below (table at the bottom). What decision should I make: should I look at AUC only, or at MAPE? Your help will be much appreciated. Thank you.

Thresholds:

• AUC

0.6-0.7: Acceptable

0.71-0.8: Good

0.81-0.9: Excellent

• MAPE

0.2-0.3: Acceptable

0.1-0.2: Good

0.001-0.1: Excellent

MAPE calculation: MAPE = (1/N) * Sum(|Actual - Predicted| / |Actual|) * 100

| Model | Gini | AUC | MAPE |
|---|---|---|---|
| Model 1 | 0.07 | 0.53 | 0 |
| Model 2 | 0.14 | 0.57 | 0.0828 |
| Model 3 | 0.37 | 0.69 | 0.0556 |
| Model 4 | 0.08 | 0.55 | 0.0673 |
| Model 5 | 0.01 | 0.51 | 0.0249 |
| Model 6 | 0.09 | 0.55 | 0.1552 |
| Model 7 | 0.18 | 0.59 | 0.2327 |
| Model 8 | 0.16 | 0.58 | 0.0654 |
| Model 9 | 0.13 | 0.57 | 0.0842 |
| Model 10 | 0.14 | 0.57 | 0.0261 |
| Model 11 | 0.35 | 0.68 | 0.1336 |
| Model 12 | 0.16 | 0.58 | 0.1704 |
| Model 13 | 0.07 | 0.54 | 0.0504 |
| Model 14 | 0.11 | 0.56 | 0.096 |
| Model 15 | 0.09 | 0.55 | 0.1478 |
| Model 16 | 0.19 | 0.6 | 0.045 |
| Model 17 | 0.18 | 0.59 | 0.0505 |
| Model 18 | 0.16 | 0.59 | 0.1472 |
| Model 19 | 0.17 | 0.59 | 0.1556 |

SAS Employee
Posts: 24

MAPE is usually for models with interval targets (regression, time series, etc.) and not appropriate for scenarios where the actual values can be 0, as this could cause a division by 0 during the MAPE calculation.

Mean absolute percentage error - Wikipedia, the free encyclopedia
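To make the division-by-zero point concrete, here is a minimal Python sketch of the MAPE formula from the original post. The function name `mape` and the sample numbers are made up for illustration; this is not how SAS Enterprise Miner computes anything internally.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent (illustrative helper)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    # Any actual value of 0 makes |Actual - Predicted| / |Actual| undefined.
    if any(a == 0 for a in actual):
        raise ZeroDivisionError("MAPE is undefined when an actual value is 0")
    n = len(actual)
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / n


# Interval target with no zeros: MAPE is well defined.
interval_mape = mape([100.0, 200.0, 400.0], [110.0, 180.0, 400.0])

# Binary 0/1 target scored with predicted probabilities: the non-events
# (actual = 0) make the calculation break down.
try:
    mape([0, 1, 1, 0], [0.2, 0.7, 0.9, 0.1])
    binary_mape_defined = True
except ZeroDivisionError:
    binary_mape_defined = False

print(round(interval_mape, 2), binary_mape_defined)
```

If a tool silently drops or patches the zero rows, the resulting MAPE only reflects the event cases, which may be one reason it can look fine while the AUC does not.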

AUC is typically for binary classifiers like logistic regression.

Receiver operating characteristic - Wikipedia, the free encyclopedia

Do you have an interval or binary target?

If you have a binary target, what is the event occurrence rate for your target? The situation you describe is common for rare target event occurrences.

To increase your c-statistic/AUC for rare targets:

- Disproportionately over-sample the rare events

- Add a weight to the rare events

- Use an inverse prior distribution
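As a rough illustration of the weighting idea in the list above, the Python sketch below computes class weights that are inversely proportional to class frequency. This is one common heuristic, assumed here for illustration; the function name `inverse_frequency_weights` is hypothetical, and the exact weighting or prior options in Enterprise Miner may work differently.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by N / (K * n_class), so rarer classes weigh more."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * cnt) for cls, cnt in counts.items()}


# Made-up sample: 10% events (1) and 90% non-events (0).
labels = [1] * 10 + [0] * 90
weights = inverse_frequency_weights(labels)

# After weighting, events and non-events contribute equal total weight.
event_weight = weights[1] * 10
nonevent_weight = weights[0] * 90
weighted_event_share = event_weight / (event_weight + nonevent_weight)

print(weights, weighted_event_share)
```

Keep in mind that oversampling or weighting shifts the predicted probabilities, so they usually need to be adjusted back to the true event rate before they are interpreted.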

Frequent Contributor
Posts: 95

Hi Patrick,

Thank you for your response. My target is binary, and I have used logistic regression to build the models. The response rate varies: some models have 50%, others 20%, others 10%, and the lowest has around

So the response rate is not that rare. Are you saying that for a binary target I shouldn't use MAPE? What should I use instead to compare actual and predicted values?

Many Thanks

SAS Employee
Posts: 24

I would use misclassification rate instead. It depends on your data, but I would be ok with a misclassification rate of 0.3 or less. GINI, AUC, c-statistic and logarithmic loss are other common measures for binary classification accuracy.

If you have a traditional binary target whose values are 0 and 1, then you should not use MAPE because you may be dividing by 0. Even if your binary target has values other than 0 and 1, MAPE and other measures like ASE and RMSE are meant for interval targets. These measures help you understand the average distance between your numeric regression predictions and your numeric observed values. In logistic regression, you are doing a classification, not a prediction. You are labeling cases as belonging to one group or another. The distance between these groups might be arbitrary or hard to understand, and that is why we look at the misclassification rate.
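For reference, the classification measures mentioned above can be sketched in plain Python as follows. The 0.5 cutoff, the function names and the sample data are assumptions for illustration only. AUC is computed here as the share of event/non-event pairs in which the event gets the higher predicted probability (ties count as half), and Gini is taken as 2 * AUC - 1, which appears consistent with the table in the original post up to rounding.

```python
def misclassification_rate(actual, prob, cutoff=0.5):
    """Share of cases whose predicted class (prob >= cutoff) differs from actual."""
    predicted = [1 if p >= cutoff else 0 for p in prob]
    return sum(a != y for a, y in zip(actual, predicted)) / len(actual)


def auc(actual, prob):
    """Probability that a random event is ranked above a random non-event."""
    events = [p for a, p in zip(actual, prob) if a == 1]
    nonevents = [p for a, p in zip(actual, prob) if a == 0]
    if not events or not nonevents:
        raise ValueError("AUC needs at least one event and one non-event")
    wins = sum((e > n) + 0.5 * (e == n) for e in events for n in nonevents)
    return wins / (len(events) * len(nonevents))


# Made-up binary target and predicted event probabilities.
actual = [0, 0, 1, 1, 0, 1]
prob = [0.1, 0.4, 0.35, 0.8, 0.6, 0.7]

misclass = misclassification_rate(actual, prob)
auc_value = auc(actual, prob)
gini = 2 * auc_value - 1

print(round(misclass, 3), round(auc_value, 3), round(gini, 3))
```

Note that the misclassification rate depends on the chosen cutoff, while AUC and Gini only depend on how the cases are ranked.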

If your misclassification rate is between 0.3 and 0.5, then there are many steps you can take to find a more accurate model, with feature selection being the foremost. Have you tried forward, backward or stepwise variable selection? Another common problem with logistic regression is quasi-complete separation. Are any of your parameters greater than 15 or 20?

Also, your data may just be noisy and difficult to model.

Frequent Contributor
Posts: 95

Hi Patrick,

Many thanks for coming back to me. When you ask "Are any of your parameters greater than 15 or 20?", what do you mean exactly? Do you mean the number of predictors?

Thank You

SAS Employee
Posts: 24