- How to adjust probability predicted using logistic regression after ov...
Posted 12-28-2021 05:31 AM
(1651 views)

I built a logistic regression model in Enterprise Miner (E-Miner). The event probability in the population base is 0.06; after oversampling, the event probability in the modeling base is 0.2. How can I adjust the predicted probabilities back to the population base using SAS code in Enterprise Guide?

I found the formula below in another SAS Community post (how-to-adjust-probabilities-after-oversampling):

$$
P_i^{**} = \frac{P_i^{*}\, R_0\, P_1}{(1 - P_i^{*})\, R_1\, P_0 + P_i^{*}\, R_0\, P_1}
$$

where:

P_i* is the unadjusted probability you get from your model

R_0 and R_1 are the sample proportions of 1 and 0 respectively

P_0 and P_1 are the original event and non-event rates (population rates)

P_i** is the true probability

But with this formula, the adjusted probability comes out higher than the unadjusted predicted probability.

8 REPLIES


Check the SCORE statement of PROC LOGISTIC.
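As a minimal sketch of that suggestion, assuming the model can be refit with PROC LOGISTIC outside Enterprise Miner: the SCORE statement accepts a PRIOREVENT= option, which takes the population event rate and adjusts the posterior probabilities written to the OUT= data set. The data set names (`train_os`, `pop_data`, `scored`) and the variables (`target`, `x1`-`x3`) are hypothetical placeholders, not names from the original post.

```sas
/* Hypothetical names: train_os is the oversampled training data,
   pop_data is the data to score, target is binary with event coded 1. */
proc logistic data=train_os;
   model target(event='1') = x1 x2 x3;
   /* PRIOREVENT= gives the population event rate (0.06 in the question),
      so the scored posterior probabilities are adjusted for oversampling. */
   score data=pop_data out=scored priorevent=0.06;
run;
```

This avoids applying the adjustment formula by hand, at the cost of refitting the model in PROC LOGISTIC rather than reusing the Enterprise Miner model.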


Hello,

First of all, make sure you are not mixing up the two target levels.

Also check which target level you are predicting: is your model giving probabilities for level_1 or for level_2?

If the above is OK, here's how to adjust:

Usage Note 22601: Adjusting for oversampling the event level in a binary logistic model

https://support.sas.com/kb/22/601.html

And from the Enterprise Miner documentation:

SAS® Enterprise Miner™ 15.1 Extension Nodes: Developer’s Guide

https://go.documentation.sas.com/doc/en/emxndg/15.1/p1vqpbjwoo4bv7n1sw77e0z64xxs.htm

Cheers,

Koen


Let me show the calculation for an example where the **predicted probability is 0.6**:

**Predicted Probability=0.6**

Sample Event Proportion=0.2

Sample Non Event Proportion=0.8

Population Event Proportion=0.06

Population Non Event Proportion=0.94

Adjusted Probability=(0.6*0.2*0.94)/[(0.4*0.8*0.06)+(0.6*0.2*0.94)] = **0.8545**

Adjusted Probability (0.8545) > Predicted Probability (0.6)

Please tell me where I am making a mistake in calculating the adjusted probability.


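```
data _NULL_;
  /* Names mirror the OldPost(i,t), OldPrior(t) and Prior(t) notation */
  OldPost_event     = 0.6;   /* predicted probability, OldPost(i,t) */
  OldPrior_event    = 0.2;   /* sample event proportion, OldPrior(t) */
  OldPrior_nonevent = 0.8;   /* sample non-event proportion */
  Prior_event       = 0.06;  /* population event proportion, Prior(t) */
  Prior_nonevent    = 0.94;  /* population non-event proportion */
  /* Weight each class by Prior/OldPrior, then renormalize */
  w_event    = OldPost_event * Prior_event / OldPrior_event;
  w_nonevent = (1 - OldPost_event) * Prior_nonevent / OldPrior_nonevent;
  Post_i_t = w_event / (w_event + w_nonevent);
  put Post_i_t=;
run;
```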

Post_i_t = 0.2769230769

Cheers,

Koen
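Written out, the DATA step in this reply reweights each class by the ratio of its population proportion to its sample proportion and then renormalizes. The symbols below are shorthand introduced for this sketch, not notation from the thread: $p$ is the model's predicted event probability, $\rho_1$ the event proportion in the oversampled data, and $\pi_1$ the event proportion in the population.

```latex
\[
p_{\text{adj}}
  = \frac{p\,\pi_1/\rho_1}{p\,\pi_1/\rho_1 + (1-p)\,(1-\pi_1)/(1-\rho_1)}
  = \frac{0.6 \times 0.06 / 0.2}{0.6 \times 0.06 / 0.2 + 0.4 \times 0.94 / 0.8}
  = \frac{0.18}{0.18 + 0.47}
  \approx 0.2769
\]
```

Compared with the hand calculation earlier in the thread, the weights sit on opposite terms. There the event term was multiplied by 0.2 × 0.94 and the non-event term by 0.8 × 0.06, whereas the expression above is equivalent to multiplying the event term by 0.8 × 0.06 and the non-event term by 0.2 × 0.94. That swap appears to be why the result moved up to 0.8545 instead of down to 0.2769.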


Hello,

You are right in expecting the mean of all adjusted probabilities to be (approximately) the event rate in the population base.

I do not know why that's not the case with your data.

However, why are you adjusting these probabilities "manually"?

If you use the Enterprise Miner target profiler, then the correct posterior probabilities (adjusted for the real priors) are automatically returned by the software.

See here :

SAS® Enterprise Miner™ 15.1: Reference Help

Predictive Modeling : https://go.documentation.sas.com/doc/en/emref/15.1/p0qiq0a4vnebuzn16v8fzossk4gp.htm

Enterprise Miner Target Profiler : https://go.documentation.sas.com/doc/en/emref/15.1/n0z1mtvsscypjqn1ediv223jq5iy.htm

Kind regards,

Koen


Hello,

The mathematical formula is in one of my posts above.

(and in the doc : SAS® Enterprise Miner™ 15.1 Extension Nodes: Developer’s Guide

https://go.documentation.sas.com/doc/en/emxndg/15.1/p1vqpbjwoo4bv7n1sw77e0z64xxs.htm).

The formula is also here (marked as solution).

Subject : Urgent, how to adjust probabilities after oversampling? Please Help, Thank you

I have successfully done it the formula-way myself several times, but cannot locate these programs any more.

Good luck,

Koen
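A minimal sketch of the formula approach discussed in this thread, applied row by row to a scored table and followed by a check of the mean. It is not the program referred to above. The data set names (`scored_os`, `scored_adj`) and the variable `p_event` (the unadjusted event probability from the oversampled model) are hypothetical; the two proportions are the ones given in the question.

```sas
/* Hypothetical names: scored_os has one row per case, with p_event holding
   the unadjusted event probability from the oversampled model. */
%let sample_event = 0.20;  /* event proportion in the oversampled data */
%let pop_event    = 0.06;  /* event proportion in the population */

data scored_adj;
   set scored_os;
   /* Reweight each class by (population proportion / sample proportion). */
   w_event    = p_event       * &pop_event       / &sample_event;
   w_nonevent = (1 - p_event) * (1 - &pop_event) / (1 - &sample_event);
   /* Renormalize so the two adjusted probabilities sum to 1. */
   p_event_adj = w_event / (w_event + w_nonevent);
   drop w_event w_nonevent;
run;

/* Compare the mean unadjusted and adjusted probabilities. */
proc means data=scored_adj mean;
   var p_event p_event_adj;
run;
```

The mean of `p_event_adj` should only be expected to land near 0.06 when the scored rows reflect the population mix. If the scored table is itself the oversampled sample, the mean will generally sit above the population rate, because events are over-represented among the rows.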
