pvareschi
Quartz | Level 8

I have just completed the material for the "Module 1: Predictive Modeling" course and would appreciate it if someone could clarify or confirm how the decision logic works:

  1. Do models (e.g. Decision Trees and Logistic Regression) use a predicted probability of 0.5 as the threshold for classifying a case as the primary outcome (i.e. IF p_predicted>0.5 THEN predicted_class=1)?
  2. If so, is there a way of altering that behaviour? My understanding is that Decision Weights can be used to change the threshold. For instance, if set to the inverse prior probabilities, a case would be classified as primary outcome if its predicted probability was above the prior probability. Is that correct?

Thanks

1 ACCEPTED SOLUTION

Accepted Solutions
Cynthia_sas
SAS Super FREQ

Hi:
Here's feedback from the course instructors:

 

---------------------------------

  1. Yes, 0.5 is the hard-coded default in Enterprise Miner as a cut-off for making decisions. But this cut-off can be changed.
  2. Yes. One method is to use the “cut-off” node to change the threshold, and the other is to use decision weights and/or a profit matrix. The cut-off node is not included in that class, but information about it can be found in the help menu of Enterprise Miner. Decision weights are covered in Lesson 7 in the section titled “Adjusting for separate sampling,” and profit matrices are also covered in Lesson 7, in the section titled “Evaluating Model Profit.”

------------------------------------
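For anyone who wants to see why decision weights act like a change of threshold, here is a minimal sketch in Python. It is not taken from the course material and does not reproduce Enterprise Miner's internals. It assumes a two-class decision rule that picks the primary outcome when `p * weight_event > (1 - p) * weight_nonevent`. The function names and the 5% prior are hypothetical, chosen only for illustration.

```python
def classify_with_weights(p_event, weight_event, weight_nonevent):
    """Return 1 when the weighted score of the event beats the non-event's."""
    event_score = p_event * weight_event
    nonevent_score = (1.0 - p_event) * weight_nonevent
    return 1 if event_score > nonevent_score else 0


def implied_threshold(weight_event, weight_nonevent):
    """Probability above which the event wins under the rule above.

    Solving p * w1 > (1 - p) * w0 for p gives p > w0 / (w1 + w0).
    """
    return weight_nonevent / (weight_event + weight_nonevent)


# Hypothetical population prior for the primary outcome (an assumption).
prior_event = 0.05

# Decision weights set to the inverse prior probabilities.
w_event = 1.0 / prior_event
w_nonevent = 1.0 / (1.0 - prior_event)

# Equal weights reproduce the familiar 0.5 cut-off.
print(implied_threshold(1.0, 1.0))

# Inverse-prior weights move the cut-off to the prior itself.
threshold = implied_threshold(w_event, w_nonevent)
print(round(threshold, 6))

# A case just below the prior is classified 0, one just above is classified 1.
print(classify_with_weights(0.04, w_event, w_nonevent))
print(classify_with_weights(0.06, w_event, w_nonevent))
```

Under that assumed rule, inverse-prior weights give a threshold equal to the prior probability, which matches the understanding described in the question.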

 

Hope this helps clarify the default and cut-off for you.

Cynthia


5 REPLIES
pvareschi
Quartz | Level 8

Thank you for your answer!

Just a quick follow-up: does the default 0.5 cut-off apply regardless of the class proportions in the sample? For instance, if I oversample to a ratio of 0.3 primary outcome / 0.7 secondary outcome and do not specify any prior probabilities, am I right in saying that 0.5 would still be used as the default cut-off for classification (based on the posterior probabilities)?

Reeza
Super User
The two items aren't directly related. They're both probabilities but not the same type of probability. One is balancing your sample to fit a model, the second predicts the probability of an outcome.
Cynthia_sas
SAS Super FREQ

Hi:

  The instructors' feedback on this question is:

"The default cut-off is always 0.5, regardless of whether or not any sampling has been done to deal with a rare event level. If the data has been over-sampled to deal with a rare event, then use decision processing (i.e., define prior probabilities), which will adjust the cut-off accordingly. If over-sampling is done and prior probabilities are not defined, then 0.5 will be used as a cut-off."

 

  Cynthia
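To make the effect of prior probabilities on an over-sampled model more concrete, here is a rough Python sketch. It uses the standard prior-correction formula for posterior probabilities. Whether Enterprise Miner applies exactly this computation is an assumption on my part, and the function names and the 5% population prior are hypothetical. The 0.3 sample proportion is the one from the question above.

```python
def adjust_posterior(p_sample, sample_event_rate, prior_event):
    """Rescale a posterior from an over-sampled model to the population prior."""
    event_term = p_sample * prior_event / sample_event_rate
    nonevent_term = (
        (1.0 - p_sample) * (1.0 - prior_event) / (1.0 - sample_event_rate)
    )
    return event_term / (event_term + nonevent_term)


def equivalent_sample_cutoff(sample_event_rate, prior_event, cutoff=0.5):
    """Cut-off on the unadjusted probability that matches `cutoff` after adjustment."""
    event_ratio = prior_event / sample_event_rate
    nonevent_ratio = (1.0 - prior_event) / (1.0 - sample_event_rate)
    return cutoff * nonevent_ratio / (
        (1.0 - cutoff) * event_ratio + cutoff * nonevent_ratio
    )


sample_event_rate = 0.3  # proportion of the primary outcome after over-sampling
prior_event = 0.05       # hypothetical population proportion (an assumption)

# A case scored at 0.5 by the over-sampled model is far less likely
# to be an event once the population prior is taken into account.
print(adjust_posterior(0.5, sample_event_rate, prior_event))

# The unadjusted probability that corresponds to 0.5 after adjustment.
print(equivalent_sample_cutoff(sample_event_rate, prior_event))
```

If no priors are supplied, nothing is rescaled and the plain 0.5 cut-off is applied to the over-sampled posteriors, which is the situation described in the instructors' last sentence.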

Reeza
Super User

1. That may be the default, but you can change that with the CUTOFF option in PROC LOGISTIC. Not sure about decision trees because they follow a different algorithm.

 

2. Yes, see the CUTOFF option for PROC LOGISTIC. 

 

The 0.5 cutoff is usually the default but you can change it as needed.