Hi,
I'm trying to understand how EM calculates the no. of events vs non-events in each ranked demi-decile after adjusting for prior probabilities.
In my original data, I have 1% events and 99% non-events.
In my sample data for model development, I have 20% events and 80% non-events.
I apply a random forest to my sample data. The model predicts that I have in my 1st bin (i.e. demi-decile with the highest scores), 343 true events and 23 true non-events.
After applying the decision node to my model results, I now have in my 1st bin (i.e. the demi-decile with the highest ADJUSTED scores), 36 true events and 332 true non-events. How was this actually determined? I understand how the posterior probabilities are adjusted but I don't understand how the no. of true events and non-events are adjusted.
I'd appreciate it if someone could help explain this.
There are two different issues involved here -- the first is obtaining probabilities centered near your population estimates and the other is determining how to classify each observation based on that probability (adjusted for priors or not) and a decision weight if you have incorporated one. By default, SAS Enterprise Miner generates a misclassification chart for the Train & Validate data sets based on two variables which have the form
F_<target variable name> : the actual target level
I_<target variable name> : the predicted target level
SAS Enterprise Miner will compute a predicted probability (adjusted for priors if requested) for each level of the target of the form
P_<target variable name><target variable level>
So for a target variable named 'BAD' with levels 0 or 1, it will generate
P_BAD1 : the predicted probability that BAD=1
P_BAD0 : the predicted probability that BAD=0
Using my example, the variable F_BAD is simply the actual target level (0 or 1), and the variable I_BAD takes the level associated with the higher of the two predicted probabilities P_BAD1 and P_BAD0. It is reasonable to assign observations to the target level which is most likely, but this presents problems in rare event scenarios.
In your oversampled data, your target level of interest occurred 20% of the time overall. Using my example, suppose that BAD=1 occurs 20% of the time in the sample. To have P_BAD1 > P_BAD0, the observation had to have P_BAD1 > 50%, which represents someone at least (50%) / (20%) = 2.5 times as likely to have the event as the overall average. After adjusting for the prior probabilities so that the overall average is only 1%, you would now need someone who was at least (50%) / (1%) = 50 times as likely to have the rare event as the overall average before the event becomes the predicted level. Since there are far fewer people in this category, there are far fewer people (possibly none!) classified as having the rare event according to I_BAD (using my example). This is why the number of predicted events changes so dramatically in your example.
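To make the arithmetic concrete, here is a minimal Python sketch of the usual prior-correction formula for a binary target that was oversampled. It assumes the standard textbook rescaling and is not taken from SAS Enterprise Miner itself; the function and parameter names are hypothetical.

```python
def adjust_for_priors(p_sample, sample_rate, population_rate):
    """Rescale an event probability fitted on oversampled data.

    Assumption: standard prior-correction formula for illustration,
    not SAS Enterprise Miner source code.
    """
    # Reweight each level by (population prior) / (sample proportion).
    event = p_sample * population_rate / sample_rate
    nonevent = (1.0 - p_sample) * (1.0 - population_rate) / (1.0 - sample_rate)
    return event / (event + nonevent)


def sample_score_needed(threshold, sample_rate, population_rate):
    """Unadjusted score whose adjusted probability equals `threshold`."""
    a = population_rate / sample_rate                   # event reweighting
    b = (1.0 - population_rate) / (1.0 - sample_rate)   # non-event reweighting
    return threshold * b / (a * (1.0 - threshold) + threshold * b)


if __name__ == "__main__":
    # 20% events in the sample, 1% events in the population.
    print(adjust_for_priors(0.20, 0.20, 0.01))
    print(sample_score_needed(0.50, 0.20, 0.01))
```

Under this formula, an observation scoring at the sample average (0.20) maps to the population average (0.01), and the unadjusted score has to be roughly 0.96 before the adjusted P_BAD1 exceeds 0.5, which is why so few observations end up with I_BAD=1.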
In these situations, you can consider using a decision weight to put more weight on the rare event. If you do add Decision weights (either in the Decisions node or in the Input Data Source node), SAS Enterprise Miner will also generate a D_<target variable name> variable which contains the 'decision' outcome based on the 'most profitable' or 'least costly' outcome. In this situation, the decision weight is multiplied by the adjusted probability to get the 'expected value' of each decision, and the outcome with the best expected value is assigned.
Assigning outcomes based on putting extra decision weight on rare events can also pose challenges since those outcomes will be predicted to occur more often than they actually do. If you click on the button 'Default with Inverse Prior Weights', SAS Enterprise Miner will take the specified prior and divide it into 1 to obtain the weight. Suppose the prior probabilities were specified as 20% and 80%. Then using the 'Default with Inverse Prior Weights' button would yield weights of 1 / 0.2 = 5 for the rare event and 1 / 0.8 = 1.25 for the common event. You will notice that the ratio of weights
5 / 1.25 = 4
is in the same ratio as the prior probabilities
80% / 20% = 4
so simply leaving the weight on the common event as 1 and changing the rare event to have a weight of 4 will have the same impact. Notice now that for the 'average' observation, which has a 20% (0.2) probability of the rare event and an 80% (0.8) probability of the common event, the expected value is the same for both levels using the weights described above:
Level          Prior   Weight   Expected Value
rare event     0.2     4        0.2 * 4 = 0.8
common event   0.8     1        0.8 * 1 = 0.8
which suggests that using 'Default with Inverse Prior Weights' will assign the target event to anyone with a probability higher than 0.2 (in this scenario), that is, to anyone with a higher predicted probability than average. This will generate a lot more predicted events based on the D_<target variable name> variable, since it is not unlikely that half or more of the observations have a predicted probability higher than average.
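The expected-value rule described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic under the 20%/80% priors used here; the function names and the dictionary layout are hypothetical and do not correspond to anything in SAS Enterprise Miner.

```python
def inverse_prior_weights(priors):
    """Weight each level by 1 / prior, as the button is described above."""
    return {level: 1.0 / prior for level, prior in priors.items()}


def decide(probabilities, weights):
    """Pick the level with the largest probability * weight."""
    expected = {
        level: probabilities[level] * weights[level] for level in probabilities
    }
    best = max(expected, key=expected.get)
    return best, expected


if __name__ == "__main__":
    weights = inverse_prior_weights({"event": 0.2, "nonevent": 0.8})
    # Slightly above-average and slightly below-average observations.
    print(decide({"event": 0.25, "nonevent": 0.75}, weights))
    print(decide({"event": 0.15, "nonevent": 0.85}, weights))
```

With weights of 5 and 1.25, the event wins whenever its probability is above 1.25 / (5 + 1.25) = 0.2, so the decision threshold sits at the prior itself.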
So what do you do? Understand that the overall misclassification rate of the data set is not what is critical. Look at the rate in each percentile of the data and determine how deep you want to go. Then you can choose your own Decision threshold (e.g. probability higher than 0.35) above which you get a satisfactory misclassification rate. The approach taken by SAS Enterprise Miner is a reasonable one since it has no business knowledge to base the outcome on other than what is provided -- either pick the most likely outcome or the most valuable outcome based on your weights -- but your best decisions will always incorporate your analytical needs.
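One way to look at the rate at each depth is to rank the scored observations and tabulate events and non-events per bin. The plain-Python sketch below does this for equal-size bins (20 bins gives demi-deciles). The function name and the binning rule are illustrative assumptions, and this is not how SAS Enterprise Miner builds its own charts.

```python
def bin_event_counts(scores, actuals, n_bins=20):
    """Return (events, non_events) per bin, highest scores first.

    `scores` are predicted event probabilities and `actuals` are 0/1
    outcomes. Assumption: ties are broken by input order, which may
    differ from the tool's own ranking.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n = len(order)
    counts = []
    for b in range(n_bins):
        # Integer arithmetic keeps bin sizes within one of each other.
        members = order[b * n // n_bins:(b + 1) * n // n_bins]
        events = sum(actuals[i] for i in members)
        counts.append((events, len(members) - events))
    return counts


if __name__ == "__main__":
    scores = [i / 100 for i in range(100)]
    actuals = [1 if i >= 90 else 0 for i in range(100)]
    for depth, (events, non_events) in enumerate(bin_event_counts(scores, actuals), 1):
        print(depth, events, non_events)
```

Reading down the bins shows where the event rate drops below what you can tolerate, and the lowest score in the last acceptable bin is a natural candidate for your own threshold.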
For example, in some cases you might need an extremely low misclassification rate (e.g. maybe only looking at the top 1% or 2% of the scored data) because you are searching for fraud and don't want to annoy customers that are not acting fraudulently. In other cases, you might be looking for a minimum response rate to make money (e.g. some direct mail advertisers only need a 2% response rate to be profitable). Your best 'decision' should always incorporate your analytical and/or business objectives.
I hope this helps!
Doug
Hi,
I am building a behavioral scoring model to calculate probability of default. Where can I see the estimated probabilities from my model in SAS Enterprise Miner?
Thanks,
Anshul.
The additional columns I described in my previous note are added to any train/validate/test data set that is passed to a modeling node in SAS Enterprise Miner, as well as to any score (Role=Score) data set passed to a subsequent Score node. You can view a sample of the data containing these additional columns by clicking on a modeling node and then clicking on the ellipsis (...) to the right of Exported Data in the properties sheet on the left. You can then highlight the row corresponding to any of the available data sets and click on Browse or Explore to see the variables that have been added. By default, they will be labeled something like "Predicted: <target variable name>=<target variable level>" for a categorical target variable. You can right-click on a column heading and choose "Name" to see the actual variable name, which is of the form "P_<target variable name><target variable level>". So in my previous example, a target variable BAD taking on values BAD=1 or BAD=0 would have its probabilities in columns named P_BAD1 and P_BAD0, with labels "Predicted: BAD=1" and "Predicted: BAD=0" respectively.
Hope this helps!
Cordially,
Doug
"SAS Enterprise Miner will compute a predicted probability (adjusted for priors if requested) for each level of the target of the form"
Doug, would you have any detail on exactly how SAS makes the adjustment for priors? I'm looking to understand the calculation and the rationale behind the calculation.