SAS EM prior probabilities


09-05-2016 02:10 PM

I'm following the Getting Started with SAS Enterprise Miner example: https://support.sas.com/documentation/onlinedoc/miner/. If I do not adjust the prior probabilities to 0.05/0.95 as suggested, but instead use 0.25/0.75, the regression and tree models produce ROC curves that lie along the line y = x. In other words, they are no better than flipping a coin. It seems that the models place every observation in the class with the larger posterior probability.

It would seem to me that adjusting the prior probabilities to 0.05/0.95 would make things worse. The help states, "Increasing the prior probability of a class increases the posterior probability, moving the classification boundary so that more cases are classified into the class." However, when you do that, the decision tree actually has splits and both ROC curves look roughly concave.

Why do the models produced with 0.05/0.95 priors produce "better" results than the models produced with 0.25/0.75 priors?


Posted in reply to Honest_Abe

09-13-2016 11:20 AM

Hello,

When you set the priors to 0.25/0.75 (assuming 0.25 for target=1 and 0.75 for target=0), you are telling the software (here Enterprise Miner; the same applies if you set a weight variable this way) that the effective event rate on the incoming target variable is 25%. In marketing terms, you already have a historical response rate of 25%. Many, if not all, marketing managers would ask why you need a response model at that point; typically, building a response model makes sense when the past response rate is 5% or less and you want to boost it. When you set the priors to 0.05/0.95, you are telling Enterprise Miner that the incoming historical event/response rate is 5%. Therefore, with 0.25/0.75 your model is fine, there is just little room for improvement, so the ROC curve looks like the 45-degree random-toss line. With 0.05/0.95, the curve looks "normal." This, of course, assumes everything else is held unchanged. Hope this helps? Thanks for using SAS.

Jason Xin


Posted in reply to Honest_Abe

09-13-2016 12:41 PM

Hi Honest_Abe,

On the page where the instructions call out the edits to the "Prior Probabilities" tab, there are other instructions about making changes on the "Decision Weights" tab as well. If you made those changes, then the model assessment criterion is based on those decision weights. If you use the numbers in the book, that means each row in your data set represents a 25% chance of making $14.50 and a 75% chance of losing $0.50 (or making -$0.50, equivalently). Think about betting $0.50 to win a $15.00 jackpot.

If your probability of winning that bet (in the population, no models, no predictions) was 25%, then you should probably take that bet all day, every day! Your expected "winnings" are (.25 x $14.5) + (.75 x -$0.50) = $3.25. Don't build a model, just take that bet! This is what I suspect happened in your case: Enterprise Miner built a series of trees and a series of regressions, but none of them could beat this average profit figure, so it "Occam's Razor"-ed you and took the simplest model that gave the best results. It's hard to get a simpler model than "mail everybody, all of the time," so that's what the tree and the regression gave you.

But! What if you lived in a world where the baseline "success" rate (the probability that TARGET_B=1) is closer to 5% than 25%? Then for each trial, your expected profit is (0.05 x $14.5) + (0.95 x -$0.50) = $0.25. Now we're talking about betting $0.50 to try to win $0.25. (In many real-world cases, the expectation is negative, and you're worse off than that.) So how do we gain an advantage? Build a model, and target the sub-population that has a favorable expected value: everyone gets a predicted probability (p), and you choose to mail only if (p x $14.5) + [(1-p) x -$0.50] comes out "large enough." "Large enough" could mean "positive," or it could be subject to some other constraints or considerations.
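The arithmetic above can be checked with a short sketch ($14.50 and -$0.50 are the decision weights from the example; the break-even probability is simply where the expected profit of mailing crosses zero):

```python
def expected_profit(p, win=14.50, loss=-0.50):
    """Expected profit of mailing one prospect whose event probability is p."""
    return p * win + (1 - p) * loss

# The two baseline "bets" from the two prior settings:
print(expected_profit(0.25))  # ≈ $3.25: take the bet on everyone, no model needed
print(expected_profit(0.05))  # ≈ $0.25: thin margin; ranking prospects now pays off

# Break-even probability: mailing is profitable only above this threshold.
breakeven = 0.50 / (14.50 + 0.50)
print(round(breakeven, 4))    # ≈ 0.0333
```

With these weights, any prospect whose predicted probability exceeds roughly 3.3% is worth mailing, which is why a useful model only emerges when the baseline rate sits near that threshold rather than far above it.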

If you don't know good numbers for those profit/loss values, or this expected-value argument doesn't meet your needs, then you can tell the model nodes to select the best model according to validation-data average squared error, and you'll probably get results more in line with your expectations.

If you didn't put any values in the "Decision Weight" tab, then I need to re-visit your question. If you have any other details about steps you were experimenting with, that might be a good clue.

Thanks!


10-16-2016 06:46 PM

I followed all of the instructions except for adjusting the prior probabilities. So I did put the weights of $14.50 and -$0.50 in. dtk's answer would make sense if all observations were placed in the positive group, but all observations are placed in the negative group when the prior probabilities are 0.25/0.75.

With both 0.25/0.75 and 0.05/0.95 priors, all observations are placed in the negative group according to the validation classification matrix. But the ROC curve is y = x for 0.25/0.75 and more concave for 0.05/0.95.

My initial post should have read, "It seems that the models place every observation in the class with the larger *prior* probability."


Posted in reply to Honest_Abe

10-17-2016 08:42 AM

Hi Honest_Abe,

Are you getting any indications in the model results that it's using the profit information? Do you see anything like "Average Profit" or "Total Profit" in the assessment statistics? Does the model that you're using have a property where you can direct it to choose the best model iteration based on "Validation Profit" or something comparable?

It sounds like the decisions coming from the modeling node are based on the "Misclassification," which would just assign an observation to the category with the highest predicted probability.
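A sketch of the difference between the two decision rules (the probabilities below are hypothetical; $14.50 and -$0.50 are the decision weights from the example):

```python
def decide_misclassification(p):
    """Assign to whichever class has the higher predicted probability."""
    return 1 if p >= 0.5 else 0

def decide_profit(p, win=14.50, loss=-0.50):
    """Mail (decision 1) only if the expected profit of mailing beats doing nothing."""
    return 1 if p * win + (1 - p) * loss > 0 else 0

# With a rare event, typical posteriors sit well below 0.5, so the
# misclassification rule mails nobody while the profit rule still mails:
for p in (0.02, 0.10, 0.30):
    print(p, decide_misclassification(p), decide_profit(p))
```

This is why the classification matrix can put every observation in the negative group (no posterior ever reaches 0.5) even while profit-based model selection is doing something sensible.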


10-17-2016 10:14 PM

Yes, for the Decision Tree node, Assessment Measure is set to its default "Decision." Other settings are as the book indicates. The same is true for Regression. Profit is mentioned in the output at each of the model nodes and the Model Comparison node:

Fit Statistics

Model Selection based on Valid: Average Profit for TARGET_B (_VAPROF_)

Selected  Model  Model          Valid: Avg. Profit  Train: Avg.    Train:          Valid: Avg.    Valid:
Model     Node   Description    for TARGET_B        Squared Error  Misclass. Rate  Squared Error  Misclass. Rate
Y         Tree   Decision Tree  3.24914             0.18752        0.25005         0.18747        0.24994
          Reg    Regression     3.24914             0.18752        0.25005         0.18747        0.24994