Larry,
First of all, kudos on reading the documentation! I will confess that I missed that detail myself. As a general rule, adjusting the posterior probabilities so that they are centered closer to the population values does not change the sort order of the observations. Also, the probabilities estimated by the model would likely be optimistic even if the data set were not oversampled, since the model is typically optimized on the data used to build and validate it. As a result, the best assessment of model performance comes from putting the model into use. SAS Model Manager is a product designed to monitor model performance over time, and it can perform retraining when performance declines. Since models do not tend to perform as well in practice as they did on the training/validation data (e.g., because time has passed, market penetration has changed, economic pressures are different, etc.), I would have had no issue with assigning prior probabilities and decision weights in the Input Data Source node and then including the Ensemble node later. The probabilities themselves are not as much of a concern to me as the sort order of the resulting scored data.
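To make the sort-order point concrete, here is a minimal sketch in Python (not Enterprise Miner code) of the usual prior-adjustment formula for a binary target. I am assuming the adjustment takes this standard form; the function name, the 50% sample event rate, and the 5% population rate are all made up for illustration.

```python
def adjust_for_priors(p, sample_rate, population_rate):
    """Rescale a posterior event probability from an oversampled
    data set toward the population prior (binary target only)."""
    # Reweight each class by (population proportion / sample proportion).
    event = p * population_rate / sample_rate
    nonevent = (1.0 - p) * (1.0 - population_rate) / (1.0 - sample_rate)
    # Renormalize so the two adjusted posteriors sum to 1.
    return event / (event + nonevent)


def rank_order(values):
    """Indices of the observations sorted from highest to lowest score."""
    return sorted(range(len(values)), key=lambda i: values[i], reverse=True)


# Hypothetical model scores from a sample oversampled to 50% events,
# adjusted to a hypothetical 5% population event rate.
scores = [0.90, 0.20, 0.65, 0.50, 0.05]
adjusted = [adjust_for_priors(p, 0.5, 0.05) for p in scores]

# The adjusted values are much smaller, but the ranking is unchanged.
same_order = rank_order(scores) == rank_order(adjusted)
print(adjusted)
print("Sort order preserved:", same_order)
```

Under this formula, a score equal to the sample event rate maps back to the population rate, and every other score moves in the same direction, which is why the ranking of the scored observations stays put.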
I have talked with one customer for whom the predicted probabilities themselves were quite important, but keep in mind that in a binary target scenario each observation in the data will either have the event or not. Probability only makes sense when looking at subgroups of observations. Since the adjustment for priors really affects where the probabilities are centered, it is possible that some groups will show probabilities higher than the adjustment suggested while other groups show probabilities that are lower. The Decisions node allows you to assign weights, which are then multiplied by the probability of each outcome to determine which decision is the most profitable (or least costly). In the end, these calculations are attempting to represent possible business goals. I always recommend a more direct approach, however: set up the priors and decision weights in the Input Data Source node so that they are available to the modeling nodes, but then focus on the sort order of the results, paying less attention to the computed probabilities or the 'Decision' unless the decision weights completely represent the business objective.
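As a rough illustration of the decision-weight arithmetic described above, here is a small Python sketch (again, not what the Decisions node literally runs). The decision names and profit values are hypothetical; the only idea taken from the discussion is that each weight is multiplied by the probability of its outcome and the decision with the best total wins.

```python
def best_decision(p_event, profit_matrix):
    """Return the decision with the highest expected profit,
    along with the expected profit of every decision."""
    expected = {}
    for decision, payoffs in profit_matrix.items():
        # Weight each outcome's payoff by the probability of that outcome.
        expected[decision] = (p_event * payoffs["event"]
                              + (1.0 - p_event) * payoffs["nonevent"])
    choice = max(expected, key=expected.get)
    return choice, expected


# Hypothetical weights: contacting a responder earns 10, contacting a
# non-responder costs 1, and skipping an observation earns nothing.
profit_matrix = {
    "contact": {"event": 10.0, "nonevent": -1.0},
    "skip": {"event": 0.0, "nonevent": 0.0},
}

for p in (0.05, 0.20):
    choice, expected = best_decision(p, profit_matrix)
    print(p, choice, expected)
```

With these made-up weights the break-even probability is 1/11, so the decision flips depending on whether you feed in the raw or the prior-adjusted probability. That is one reason the 'Decision' is only as trustworthy as the weights behind it.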
When all is said and done, your mileage may vary, in which case you might consider trying both approaches (one specifying decision weights and priors in the Input Data Source node, the other not specifying them at all prior to modeling) and then choosing the approach that seems to perform best on your data. I am doubtful that going through the extra work of setting up a Decisions node after each modeling node, which the documentation could be interpreted to suggest, will be as good a use of your time as investigating more models. If you are intent on getting probabilities that have been adjusted overall to be more like the population, using the Decisions node after the modeling node is the only way to do that.
Let me know what you think.
Cordially, Doug