Solved: Re: a rare target using an oversample approach in SAS Enterprise Miner

sathya66 · Posted 09-26-2016 04:36 PM

Hi ,

The target variable in my data source has a very rare event (~0.0019%), and overall there are 7,240,251 observations. Could you please explain the oversampling approach step by step, or is there any other method to build a model?

I am using the Sample node now, but it does not seem to be working for me.

Thanks,

Sathya.

WendyCzika · Posted 09-27-2016 10:55 AM

Please take a look at this post: https://communities.sas.com/t5/SAS-Communities-Library/Tip-How-to-model-a-rare-target-using-an-overs.... This outlines one of the ways you can deal with rare targets. There is also a section "Detecting Rare Classes" under Analytics>Predictive Modeling in the SAS Enterprise Miner Reference Help.

Hope this helps!

Wendy
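As a rough illustration of the oversampling idea mentioned above, here is a minimal sketch in Python rather than Enterprise Miner itself. It keeps every event row and draws a random subset of non-events so that events reach a chosen share of the sample. The function name, the 10% event share, and the toy data are all assumptions made for illustration; they do not come from the linked tip.

```python
import random

def oversample_rare_events(rows, is_event, event_share=0.10, seed=42):
    """Keep every event row plus a random subset of non-events so that
    events make up roughly `event_share` of the returned sample."""
    rng = random.Random(seed)
    events = [r for r in rows if is_event(r)]
    non_events = [r for r in rows if not is_event(r)]
    # Number of non-events needed to reach the requested event share.
    n_non = round(len(events) * (1 - event_share) / event_share)
    n_non = min(n_non, len(non_events))
    sample = events + rng.sample(non_events, n_non)
    rng.shuffle(sample)
    return sample

# Hypothetical toy data: 20 events among 100,000 rows.
rows = [{"id": i, "target": 1 if i < 20 else 0} for i in range(100_000)]
sample = oversample_rare_events(rows, lambda r: r["target"] == 1)
n_events = sum(r["target"] for r in sample)
print(len(sample), n_events, n_events / len(sample))
```

A model trained on such a sample sees far more events than the population contains, so its raw probabilities are inflated and need correcting before they are read as real-world rates.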

sathya66 · Posted 09-27-2016 11:34 AM

Thanks Wendy,
I have been through those steps but no luck.
The EM_EVENTPROBABILITY column values are > 0.50 (i.e., more than 50%) for 241,000 rows,
but I would have expected SAS Enterprise Miner to give much smaller probabilities (e.g., 0.00678), I guess.
Please advise.
Thanks,
sathya.

sathya66 · Posted 09-27-2016 11:55 AM

Also, after scoring, it creates some new variables, for example:
EM_EVENTPROBABILITY
EM_PROBABILITY
EM_CLASSIFICATION (in 0s and 1s)
Which of these variables should I use as the predicted column?
Thanks,
Sathya.

WendyCzika · Posted 09-27-2016 01:21 PM

Actually, I think you want to use the other approach for dealing with rare targets, which is to adjust the posterior probabilities instead of entering decision weights (those only affect profit, not the other fit statistics). To do that, in the Decisions node, stop using the inverse priors on the diagonal of the decision matrix and revert those values to 1's. Then click Refresh on the Targets tab, and on the Prior Probabilities tab enter the original priors for your target (e.g., the very rare proportion for your event). This applies an adjustment to your posterior probabilities; hopefully you will see better results this way.
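To make the effect of that adjustment concrete, here is a minimal Python sketch of the standard prior-correction formula: each posterior is multiplied by the ratio of the original prior to the sample prior, and the results are rescaled to sum to 1. That Enterprise Miner computes exactly this internally is an assumption, and the 50/50 training sample and the 0.60 raw posterior are hypothetical numbers. Only the 0.0019% event rate comes from the thread.

```python
def adjust_posteriors(posteriors, sample_priors, true_priors):
    """Rescale posteriors from an oversampled model back to the original priors.

    Each posterior is weighted by true_prior / sample_prior, then the
    weighted values are normalized so they sum to 1.
    """
    weighted = [p * t / s for p, s, t in zip(posteriors, sample_priors, true_priors)]
    total = sum(weighted)
    return [w / total for w in weighted]

# Hypothetical case: model trained on a 50/50 sample, true event rate 0.0019%.
true_event_rate = 0.000019
raw = [0.60, 0.40]  # [P(event), P(non-event)] from the oversampled model
adjusted = adjust_posteriors(raw, [0.5, 0.5], [true_event_rate, 1 - true_event_rate])
print(adjusted)
```

In this sketch a raw event probability above 0.50 shrinks to a value far below 1% once the original priors are applied, which is consistent with expecting small probabilities for such a rare event.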

To answer your other question, EM_CLASSIFICATION is the generically named variable containing the predictions based on your model. Here are more details about those variables from the Score node:

EM_PROBABILITY	Probability of Classification	Posterior probability associated with the predicted classification. That is, it corresponds to the maximum of the posterior probabilities, max(P1, P2, ..., Pk).
EM_EVENTPROBABILITY	Probability for level n of vnm	Posterior probability associated with target event.
EM_CLASSIFICATION	Prediction for vnm	I_variable, the prediction variable for a class target.
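The relationship between these three variables can be sketched in a few lines of Python. This is an illustration only: it assumes the predicted level is simply the one with the highest posterior probability and that no decision matrix alters the choice. The function name and the example probabilities are hypothetical.

```python
def score_row(posteriors, event_level):
    """Derive the three score variables from one row's posterior probabilities.

    posteriors: dict mapping each target level to its posterior probability.
    event_level: the target level treated as the event.
    """
    # Assumption: the prediction is the level with the largest posterior.
    predicted = max(posteriors, key=posteriors.get)
    return {
        "EM_CLASSIFICATION": predicted,
        "EM_PROBABILITY": posteriors[predicted],
        "EM_EVENTPROBABILITY": posteriors[event_level],
    }

row = score_row({"1": 0.30, "0": 0.70}, event_level="1")
print(row)
```

Under that assumption, EM_PROBABILITY and EM_EVENTPROBABILITY coincide only when the event is the predicted level. For ranking rows by how likely the event is, EM_EVENTPROBABILITY is the one to sort on.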

sathya66 · Posted 10-03-2016 07:21 AM

Thank you Wendy.

sathya66 · Posted 10-05-2016 04:18 AM

Now that I have built a model, I want to automate the SAS EM project. Is it possible to automate the project, either with a batch job or with a scheduler (running every month with a new history/input table and a new score table)?
Please help me with this.
Thanks,
Sathya .
