Thank you, Wendy. I'm using inverse priors in the decision matrix, so would the misclassification rate of, let's say, a decision tree take into account that the data is sampled? Here's the situation driving my question: when I deal with rare events (the event happens in 5% of the data), I'll sometimes get a misclassification rate of, let's say, 15% on validation data. I then try oversampling (with inverse priors, of course), increasing the event proportion from 5% to 10%, 20%, 30%, etc., and I end up getting misclassification rates higher than the original 15%. Is there a way to compare misclassification rates across different subsampling proportions? SAS's training material usually suggests oversampling in situations of rare events, but I've been getting worse results when I do this.
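
To make the comparison concrete: one way to put misclassification rates from different oversampling proportions on the same footing is to re-weight each class's error rate by the population event rate (5% here) instead of the sampled one. Below is a minimal Python sketch of that arithmetic. It is not how SAS computes its statistics; the function name and the confusion-matrix counts are hypothetical and only illustrate the calculation.

```python
def prior_adjusted_misclassification(tp, fn, fp, tn, true_event_rate):
    """Misclassification rate re-weighted to the population event rate.

    tp, fn, fp, tn are confusion-matrix counts from the (possibly
    oversampled) validation data; true_event_rate is the event
    proportion in the original population.
    """
    sensitivity = tp / (tp + fn)  # share of events classified correctly
    specificity = tn / (tn + fp)  # share of non-events classified correctly
    return (true_event_rate * (1 - sensitivity)
            + (1 - true_event_rate) * (1 - specificity))


# Hypothetical counts from a validation set oversampled to 30% events.
tp, fn, fp, tn = 240, 60, 70, 630

# Rate as measured on the oversampled data.
raw_rate = (fn + fp) / (tp + fn + fp + tn)

# Same model, with class error rates weighted by the 5% population prior.
adjusted_rate = prior_adjusted_misclassification(
    tp, fn, fp, tn, true_event_rate=0.05
)

print(f"raw: {raw_rate:.3f}, adjusted to 5% prior: {adjusted_rate:.3f}")
```

Because the raw rate depends on the event proportion in the sample, only the adjusted figures (all computed with the same 5% prior) are comparable across the 10%, 20%, and 30% runs.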