- a rare target using an oversimple approach in SAS ...

09-26-2016 04:36 PM

Hi,

The event in my target variable is very rare (~0.0019%), and the data source has 7,240,251 observations overall. Could you please explain the oversampling approach step by step, or suggest any other method for building a model?

I am using the Sample node now, but it does not seem to be working for me.

Thanks,

Sathya.

All Replies

09-27-2016 10:55 AM

Please take a look at this post: https://communities.sas.com/t5/SAS-Communities-Library/Tip-How-to-model-a-rare-target-using-an-overs.... This outlines one of the ways you can deal with rare targets. There is also a section "Detecting Rare Classes" under **Analytics**>**Predictive Modeling** in the SAS Enterprise Miner Reference Help.

Hope this helps!

Wendy

09-27-2016 11:34 AM

Thanks Wendy,

I have been through those steps, but no luck.

The EM_EVENTPROBABILITY column values are > 0.50 (that is, more than 50%) for 241,000 rows.

But I would expect SAS Enterprise Miner to give much smaller probabilities (for example, 0.00678), I guess.

Please advise.

Thanks,

sathya.

09-27-2016 11:55 AM - edited 09-27-2016 11:55 AM

Also, after scoring, it creates some variables, for example:

EM_EVENTPROBABILITY

EM_PROBABILITY

EM_CLASSIFICATION (in 0s and 1s)

Which of these variables should I use as the predicted column?

Thanks,

Sathya.

Solution

10-03-2016
07:21 AM

09-27-2016 01:21 PM - edited 09-27-2016 02:20 PM

Actually, I think you want the other approach for dealing with rare targets, which is to adjust the posterior probabilities instead of entering the decision weights (those only affect profit, not the other fit statistics). To do that, in the Decisions node, stop using the inverse priors on the diagonal of the decision matrix and revert those to 1's. Then click Refresh on the Targets tab and, on the Prior Probabilities tab, enter the original priors for your target (for example, the very rare proportion for your event). This applies an adjustment to your posterior probabilities; hopefully you will see better results this way.
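To illustrate what such an adjustment does to the numbers, here is a minimal Python sketch of the standard prior-correction formula for a binary target. It is an assumption that Enterprise Miner applies exactly this formula; the function name `adjust_posterior` and the 50/50 oversampled proportion are hypothetical, and only the ~0.0019% event rate comes from the question.

```python
def adjust_posterior(p_sample, sample_prior, true_prior):
    """Rescale an event posterior from a model trained on oversampled data
    so that it reflects the event rate in the original population.

    p_sample     -- event posterior produced by the model on the oversampled scale
    sample_prior -- event proportion in the oversampled training data
    true_prior   -- event proportion in the original population
    """
    if not (0.0 < sample_prior < 1.0 and 0.0 < true_prior < 1.0):
        raise ValueError("priors must be strictly between 0 and 1")
    # Weight each class by (true prior / sample prior), then renormalize.
    event = p_sample * true_prior / sample_prior
    non_event = (1.0 - p_sample) * (1.0 - true_prior) / (1.0 - sample_prior)
    return event / (event + non_event)


TRUE_PRIOR = 0.000019   # ~0.0019% event rate, from the question
SAMPLE_PRIOR = 0.5      # hypothetical 50/50 oversample (assumption)

for p in (0.5, 0.9, 0.99):
    print(p, adjust_posterior(p, SAMPLE_PRIOR, TRUE_PRIOR))
```

Under these assumptions, a row that scores 0.50 on the oversampled scale maps back to roughly the original event rate, which is consistent with seeing many values above 0.50 before the priors are entered and much smaller values afterwards.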

To answer your other question, EM_CLASSIFICATION is the generically named variable containing the predictions based on your model. Here are more details about those variables from the Score node:

EM_PROBABILITY | Probability of Classification | Posterior probability associated with the predicted classification. That is, it corresponds to the maximum of the posterior probabilities, max(P1, P2, ..., Pk).

EM_EVENTPROBABILITY | Probability for level n of vnm | Posterior probability associated with the target event.

EM_CLASSIFICATION | Prediction for vnm | I_variable, the prediction variable for a class target.
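As a rough illustration of how the three columns relate to one another, the following Python sketch derives them from a set of posterior probabilities. The function name `score_columns` and the example posteriors are hypothetical, and the sketch assumes the classification is simply the level with the highest posterior, as the descriptions above suggest.

```python
def score_columns(posteriors, event_level):
    """Derive the three Score node columns from per-level posteriors.

    posteriors  -- dict mapping each target level to its posterior probability
    event_level -- the level treated as the target event
    """
    # The predicted level is the one with the largest posterior probability.
    predicted = max(posteriors, key=posteriors.get)
    return {
        "EM_EVENTPROBABILITY": posteriors[event_level],  # posterior of the event
        "EM_PROBABILITY": posteriors[predicted],         # max(P1, ..., Pk)
        "EM_CLASSIFICATION": predicted,                  # predicted level
    }


# Hypothetical binary target with event level "1".
row = score_columns({"1": 0.2, "0": 0.8}, event_level="1")
print(row)
```

For a binary target, EM_EVENTPROBABILITY and EM_PROBABILITY agree only when the event is the predicted level, so EM_EVENTPROBABILITY is the column to use for ranking cases by event likelihood and EM_CLASSIFICATION the one to use as the 0/1 prediction.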

10-03-2016 07:21 AM

Thank you Wendy.

10-05-2016 04:18 AM

Now that I have built a model, I want to automate the SAS EM project. Is it possible to automate the project with either a batch job or a scheduler (running every month with a new history/input table and a new score table)?

Please help me with this.

Thanks,

Sathya .