pvareschi
Quartz | Level 8

Perhaps this is well known to everybody, but it came as a surprise to me: after running some tests on fitting models with and without prior probabilities defined, I have noticed that Enterprise Miner does not take prior probabilities into account when calculating the Average Squared Error (ASE); the same applies to the residuals saved in the output data sets.

That being the case, I just want to clarify whether there is any scenario under which we would end up choosing a different model (out of a sequence of models of increasing complexity, e.g. from a Regression node) if ASE were indeed adjusted for prior probabilities.

My instinct tells me that is not the case, but I wonder whether there is a more mathematical justification for that.
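For reference, this is the kind of quick check I mean, sketched outside Enterprise Miner in Python (purely illustrative: the "posteriors" below are simulated rather than produced by EM nodes, the 5% population prior is made up, and a full EM-style adjustment would also weight the cases rather than just rescale the posteriors):

```python
import numpy as np

def adjust_for_priors(p, rho1, pi1):
    """Rescale event posteriors from the sample prior rho1 to the population prior pi1."""
    num = p * (pi1 / rho1)
    return num / (num + (1 - p) * ((1 - pi1) / (1 - rho1)))

def ase(y, p):
    """Average squared error between the 0/1 target and the predicted event probability."""
    return np.mean((y - p) ** 2)

# Oversampled-style validation data: roughly 50% events in the sample, 5% assumed in the population.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=1000)
p_simple  = np.clip(0.5 * y + rng.normal(0.25, 0.25, size=1000), 0.01, 0.99)  # weaker candidate
p_complex = np.clip(0.7 * y + rng.normal(0.15, 0.15, size=1000), 0.01, 0.99)  # stronger candidate

rho1, pi1 = y.mean(), 0.05
for name, p in [("simple ", p_simple), ("complex", p_complex)]:
    print(name,
          "raw ASE:", round(float(ase(y, p)), 4),
          "| prior-adjusted ASE:", round(float(ase(y, adjust_for_priors(p, rho1, pi1))), 4))
```

In this toy example the adjusted ASE values are very different from the raw ones, yet the ordering of the two candidates comes out the same, which is what prompted the question.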

5 REPLIES
pvareschi
Quartz | Level 8

Just to further clarify, I am referring to "Applied Analytics Using SAS Enterprise Miner", "Lesson 7: Model Assessment Using SAS Enterprise Miner", "Adjusting for Separate Sampling": if we do not specify prior probabilities, we know that the performance metrics are inaccurate and/or biased. What I am concerned about, however, is whether that would affect the choice of the "best model", especially when applied within a single modelling node to assess model complexity. My understanding is that it would not, at least when using ASE or the misclassification rate (profit/loss would be a different matter).

gcjfernandez
SAS Employee

I agree with your comments, because adjusting for prior probabilities basically only shifts the intercept values, so it should not affect model selection. However, if you want the prior values to affect your model decisions, you should consider the decision option and provide decision weights (please refer to Chapter 6 of the AAEM course notes).
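To make that explicit for the logistic case, here is a short sketch of the standard separate-sampling correction (my notation, not a quote from the course notes), writing \rho_1, \rho_0 for the event and non-event proportions in the modelling sample and \pi_1, \pi_0 for the population priors:

```latex
% Rescaling a posterior p from sample priors (\rho_1, \rho_0) to population priors (\pi_1, \pi_0)
p^{*} = \frac{p \, \pi_1 / \rho_1}{p \, \pi_1 / \rho_1 + (1 - p) \, \pi_0 / \rho_0}
\quad \Longrightarrow \quad
\operatorname{logit}(p^{*}) = \operatorname{logit}(p) + \ln \frac{\pi_1 \, \rho_0}{\pi_0 \, \rho_1}
```

For a logistic model with logit(p) = \beta_0 + \beta^T x, the correction therefore only replaces \beta_0 with \beta_0 + \ln(\pi_1 \rho_0 / (\pi_0 \rho_1)); the slopes, and with them the ordering of the scored observations, stay the same.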

pvareschi
Quartz | Level 8

Just a further clarification on the statement "adjusting for prior probabilities basically only shifts the intercept values": that is true for linear models (e.g. logistic regression), but what about non-parametric or non-linear models such as decision trees and neural networks? Would that still just result in a shift of the intercept values?

gcjfernandez
SAS Employee

When the target variable is binary, we call the predictive model a classification model, and the goal of a decision tree, logistic regression, or neural network is to classify the binary target correctly. These models can be assessed over all possible pairs of one event and one non-event: if the model scores the event higher than the non-event, the pair is called concordant; otherwise it is discordant. By random chance alone there is a 50% chance of picking out the event within a pair, and we hope the model we develop has a substantially higher chance of differentiating events from non-events. These statistics (the percentages of concordant and discordant pairs) are the basis of the ROC index. The ROC index is not influenced by the prior probabilities, which is one reason it is a popular model-comparison statistic. Also, by default, the proportion of events to non-events in the population is not considered when classification models are developed.
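To make the pairwise definition concrete, here is a small illustrative Python sketch (my own toy example with synthetic scores, not Enterprise Miner's implementation) that counts concordant, discordant, and tied event/non-event pairs and then checks that the ROC index does not change when the posteriors are rescaled for a different prior, since the rescaling is a monotone transformation:

```python
import numpy as np

def roc_index(y, p):
    """C-statistic: fraction of event/non-event pairs in which the event gets the
    higher score; tied pairs count as half a concordance."""
    events, nonevents = p[y == 1], p[y == 0]
    concordant = discordant = tied = 0
    for pe in events:
        concordant += np.sum(pe > nonevents)
        discordant += np.sum(pe < nonevents)
        tied += np.sum(pe == nonevents)
    return (concordant + 0.5 * tied) / (concordant + discordant + tied)

# Synthetic scores standing in for any model's event posteriors (~50% events in the sample).
rng = np.random.default_rng(1)
y = rng.integers(0, 2, size=500)
p = np.clip(0.5 * y + rng.normal(0.25, 0.2, size=500), 0.001, 0.999)

# Rescale from the ~50% sample prior to an assumed 5% population prior (monotone in p).
a, b = 0.05 / y.mean(), 0.95 / (1 - y.mean())
p_adj = (p * a) / (p * a + (1 - p) * b)

print("ROC index, raw posteriors:      ", round(float(roc_index(y, p)), 4))
print("ROC index, prior-adjusted (5%): ", round(float(roc_index(y, p_adj)), 4))
```

In this example both lines print the same value, because the adjustment preserves the ordering of every event/non-event pair.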

 

However, if the goal of scoring is to compute posterior probabilities, then the posterior probabilities need to be adjusted for the prior probabilities after the model has been developed. This adjustment is the same whether we use a decision tree (baseline adjustment), logistic regression, or a neural network (intercept, offset, or bias), because the prior-probability adjustment does not involve the non-linear component of the model.
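As a rough sketch of that point (the scikit-learn models below simply stand in for the Enterprise Miner nodes, and adjust_for_priors is my own name for the standard correction, with a made-up 5% population prior), the very same rescaling can be applied after fitting a tree, a logistic regression, or a neural network, and because it is monotone it leaves the ranking of the scored observations untouched:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

def adjust_for_priors(p, rho1, pi1):
    """Separate-sampling correction applied to any model's event posteriors."""
    num = p * (pi1 / rho1)
    return num / (num + (1 - p) * ((1 - pi1) / (1 - rho1)))

# Balanced (oversampled-style) training data; assume the true population prior is 5%.
X, y = make_classification(n_samples=2000, weights=[0.5, 0.5], random_state=0)
rho1, pi1 = y.mean(), 0.05

models = {
    "tree":     DecisionTreeClassifier(max_depth=4, random_state=0),
    "logistic": LogisticRegression(max_iter=1000),
    "neural":   MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0),
}
for name, model in models.items():
    p = model.fit(X, y).predict_proba(X)[:, 1]        # raw posteriors from the fitted model
    p_star = adjust_for_priors(p, rho1, pi1)          # identical correction for every model
    same_order = np.array_equal(np.argsort(p, kind="stable"),
                                np.argsort(p_star, kind="stable"))
    print(f"{name:8s} mean posterior {p.mean():.3f} -> adjusted {p_star.mean():.3f}, "
          f"ranking preserved: {same_order}")
```

The adjusted posteriors are pulled down towards the 5% prior for all three model types, but the order of the observations, and hence pair-based statistics such as the ROC index, is unchanged.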

I hope this explanation is adequate.

pvareschi
Quartz | Level 8

Thank you for your explanation; very thorough!

 
