09-28-2015 07:25 PM

Hi! In PROC LOGISTIC one can specify a prior probability for the event using the PEVENT= option in the MODEL statement. So, if the training sample has an equal number of events and non-events, but the true proportion of events in the population is 5%, just set PEVENT=0.05 to adjust for the bias in the sample.
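For concreteness, here is a minimal sketch of what I mean. The data set name train, the response event (coded 1 for an event), and the predictors x1 and x2 are all made up for illustration. As far as I understand the documentation, PEVENT= only takes effect together with the CTABLE option, where the prior is used when computing the classification table's rates.

```sas
/* Hypothetical names: data set TRAIN, binary response EVENT (1 = event),
   predictors X1 and X2. None of these come from a real data set. */
proc logistic data=train;
   /* CTABLE requests the classification table; PEVENT=0.05 supplies the
      assumed population event rate in place of the sample proportion */
   model event(event='1') = x1 x2 / ctable pevent=0.05;
run;
```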

Now, suppose life is a bit more complicated. Let there be a variable called subgroup with values "A" and "B." Suppose my training sample is 20% "A" and 80% "B," but the true population is 5% "A" and 95% "B." Suppose further that the sample has 30% events in subgroup = "A" and 40% events in subgroup = "B," although the "A" population really has 2% events and the "B" population really has 6% events. Is there a way to use the PEVENT= option (perhaps together with another option) to adjust the results to the population proportions?

Thanks!

09-29-2015 09:22 AM

I believe that adding a WEIGHT variable can take care of the relative percentages of "A" and "B," but that still leaves me wondering how (and whether) PEVENT= and/or other options can properly adjust the percentages of events within each category.
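To make the weighting idea concrete, here is a rough sketch of one possibility: weight each subgroup-by-outcome cell by its population share divided by its sample share, using the percentages from my first post. This is only an assumption about how the weights might be built, not a confirmed answer, and the names train, train_w, event, w, x1, and x2 are hypothetical.

```sas
/* Hypothetical names: data set TRAIN, binary response EVENT (1 = event),
   character variable SUBGROUP ('A' or 'B'), predictors X1 and X2. */
data train_w;
   set train;
   /* weight = population share of the cell / sample share of the cell */
   if      subgroup = 'A' and event = 1 then w = (0.05*0.02) / (0.20*0.30);
   else if subgroup = 'A' and event = 0 then w = (0.05*0.98) / (0.20*0.70);
   else if subgroup = 'B' and event = 1 then w = (0.95*0.06) / (0.80*0.40);
   else if subgroup = 'B' and event = 0 then w = (0.95*0.94) / (0.80*0.60);
run;

proc logistic data=train_w;
   class subgroup / param=ref;
   model event(event='1') = subgroup x1 x2;
   /* NORMALIZE rescales the weights so they sum to the actual sample size */
   weight w / normalize;
run;
```

With these numbers the four weights come out to roughly 0.017 (A, event), 0.35 (A, non-event), 0.178 (B, event), and 1.86 (B, non-event). Whether this is a proper substitute for a per-subgroup PEVENT= adjustment is exactly what I am unsure about.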