Replicating predicted probabilities from Proc Logi...

07-18-2012 12:37 PM

Good afternoon all.

I am currently trying to replicate in Excel the predicted probabilities from PROC LOGISTIC, for a default probability prediction on a sample of counterparties.

I ran PROC LOGISTIC so that Default = pred1 -- pred8, with pred1 to pred8 being a series of financial ratios.

I got the estimates for my 8 predictors and the intercept calculated by SAS, as well as the predicted= values from SAS for my sample.

I started plugging the MLE estimates and my predictor values into the formula 1/(1+Exp(-(intercept+pred1*estimate1+...))), thinking I would get the same probabilities, but that is not the case. The differences between the SAS predicted default probabilities and mine are huge (though the ranking/ordering seems right).

I must have done something wrong, but I cannot find out what... Any clue would be welcome, as I am quite new at this.

Is there a way to get the actual values SAS uses to calculate the predicted probabilities, in case they are adapted from the reported estimates, or anything else?

Thank you very much for your opinion.

Best,

S.
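
For readers following along, the calculation described in the question can be sketched in Python as below. This is only an illustration of the inverse-logit formula from the post; the intercept, coefficients, and predictor values are hypothetical and are not taken from the thread.

```python
import math

def linear_predictor(intercept, coefs, xs):
    """Return intercept + sum(coef_i * x_i)."""
    if len(coefs) != len(xs):
        raise ValueError("coefs and xs must have the same length")
    return intercept + sum(b * x for b, x in zip(coefs, xs))

def inv_logit(eta):
    """Inverse of the logit link: 1 / (1 + exp(-eta))."""
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical numbers, chosen only for illustration.
intercept = -2.0
coefs = [0.5, 1.5]
xs = [1.0, 0.4]

eta = linear_predictor(intercept, coefs, xs)  # -2.0 + 0.5*1.0 + 1.5*0.4
p = inv_logit(eta)
print(eta, p)
```

This formula only reproduces the software's predicted probabilities when the model was fitted with a logit link and the intercept is included in the linear predictor.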

Accepted Solutions

Posted in reply to schnitt

07-19-2012 12:44 PM

You're using a probit link, not a logit link, so your formula 1/(1+exp(-xB)) isn't correct; that's the inverse of the logit link.

I think it's the standard normal CDF of xB that you're looking for instead, which is the NORMSDIST function in Excel (not the inverse normal).
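
As a rough sketch of the difference between the two links, the Python snippet below applies both inverse links to the same linear predictor values. The `etas` values are hypothetical; `statistics.NormalDist().cdf` plays the role of the standard normal CDF.

```python
import math
from statistics import NormalDist

def inv_logit(eta):
    """Inverse logit link: 1 / (1 + exp(-eta))."""
    return 1.0 / (1.0 + math.exp(-eta))

def inv_probit(eta):
    """Inverse probit link: standard normal CDF evaluated at eta."""
    return NormalDist().cdf(eta)

# Hypothetical linear predictor values (xB), for illustration only.
etas = [-2.5, -1.0, 0.0, 1.5]

logit_p = [inv_logit(e) for e in etas]
probit_p = [inv_probit(e) for e in etas]

for e, pl, pp in zip(etas, logit_p, probit_p):
    print(f"xB={e:5.2f}  logit p={pl:.4f}  probit p={pp:.4f}")
```

Both functions are increasing in xB, so the two sets of probabilities rank the observations identically even though the values differ, which is consistent with the correct ordering reported in the question.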

All Replies

Posted in reply to schnitt

07-18-2012 01:14 PM

Check which event you're modelling, and whether your result matches 1 - probability (the complement).

In the predicted output there are two probabilities that are specified, one for event=1 and one for event=0.

See the first example (Example 53.1 Stepwise logistic regression and predicted values) in the PROC LOGISTIC documentation for how to get the predicted values and the observations.

Otherwise, show the code you used and the calculation implemented in the Excel sheet.

Posted in reply to schnitt

07-18-2012 01:32 PM

OK - Interesting.

I used the data in Example 53.1 and verified the results using event=1, but the estimates actually calculate the probability of event=0.

If I model event=0, the signs on the parameters are flipped and the probability is for event=1.

That seems strange to me.
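
A small sketch of the sign-flip relationship for a binary model: negating every estimate turns the predicted probability p into 1 - p. The estimates and the predictor value below are hypothetical.

```python
import math

def inv_logit(eta):
    """Inverse logit link: 1 / (1 + exp(-eta))."""
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical estimates from a fit that models event=1.
b0_event1, b1_event1 = -1.2, 0.8
# Modelling event=0 instead negates every estimate.
b0_event0, b1_event0 = -b0_event1, -b1_event1

x = 0.5
p_event1 = inv_logit(b0_event1 + b1_event1 * x)
p_event0 = inv_logit(b0_event0 + b1_event0 * x)

# The two probabilities are complements of each other.
print(p_event1, p_event0, p_event1 + p_event0)
```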

Posted in reply to schnitt

07-18-2012 01:50 PM

Doh.

The formula is 1/(1+exp(-1*(xB))) or, equivalently, exp(xB)/(1+exp(xB)).
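
The two forms are algebraically the same (multiply the numerator and denominator of the first by exp(xB)). A quick numerical check in Python, with an optional sign-based variant that avoids overflow for extreme values, might look like this; the function names are hypothetical.

```python
import math

def form_a(eta):
    """1 / (1 + exp(-eta))"""
    return 1.0 / (1.0 + math.exp(-eta))

def form_b(eta):
    """exp(eta) / (1 + exp(eta))"""
    e = math.exp(eta)
    return e / (1.0 + e)

def inv_logit_stable(eta):
    """Pick the form whose exp() argument is non-positive, so it cannot overflow."""
    return form_a(eta) if eta >= 0 else form_b(eta)

for eta in [-5.0, -0.3, 0.0, 2.0, 10.0]:
    print(eta, form_a(eta), form_b(eta))
```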

Posted in reply to Reeza

07-19-2012 04:57 AM

Reeza, thanks for checking.

Here is the code in SAS and an excel extract:

proc logistic data = predicted_ plots(only)=ROC outest = outest descending;
    model Defaut = pred1--pred10 / link = probit;
    /* pred1 = RatioEBITDAssets, pred2 = RatioOrdinaryprofitSales, etc. */
    output out = modele predicted = proba;
run;

| intercept | pred1 | pred2 | pred3 | pred4 | pred5 | pred6 | pred7 | pred8 | pred9 | pred10 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| -7.60425 | 0.548002 | 0.472262 | 0.622244 | 0.173969 | -0.43673 | 0.205435 | 2.023561 | 0.577662 | 0.084357 | 3.209399971 |

| pred1 | pred2 | pred3 | pred4 | pred5 | pred6 | pred7 | pred8 | pred9 | pred10 | prob SAS | prob Calc |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1.21801 | 0.390616 | 1.279925 | 0.847633 | 0.291004 | 1.236234 | 0.79677 | 0.983539 | 0.721735 | 0.775943761 | 0.092541611 | 0.998713281 |
| 0.830005 | 0.445053 | 1.085654 | 0.447736 | 0.49901 | 0.727731 | 0.748994 | 0.217191 | 0.80555 | 0.770496504 | 0.016109634 | 0.996057204 |
| 1.21801 | 0.445053 | 0.861114 | 0.847633 | 0.291004 | 0.553417 | 0.79677 | 0.605676 | 0.809705 | 0.775943761 | 0.045931346 | 0.997690352 |
| 0.830005 | 0.390616 | 0.861114 | 0.447736 | 0.291004 | 0.981329 | 0.817381 | 0.605676 | 0.809705 | 0.772191912 | 0.017793245 | 0.997207484 |
| 1.21801 | 1.053004 | 1.085654 | 0.905838 | 1.042944 | 1.236234 | 0.748994 | 0.954705 | 0.785543 | 0.775943761 | 0.076179617 | 0.998374142 |
| 0.830005 | 1.053004 | 0.097614 | 0.663831 | 0.49901 | 0.412175 | 0.748994 | 0.217191 | 0.809705 | 0.770496504 | 0.003747653 | 0.994388683 |
| 0.830005 | 0.445053 | 1.279925 | 0.905838 | 0.49901 | 0.727731 | 0.79677 | 0.977681 | 0.744094 | 0.770496504 | 0.068019116 | 0.998098659 |

I am confident in the SAS results; I only want to calculate something remotely close to them.

As you can see, it is not a "mere" issue of which outcome I model, so there is no p vs. (1-p) relationship...

Thanks again,

JP

Posted in reply to Reeza

07-20-2012 04:45 AM

Stupid me...

You are right, this fixes everything.

I could have spent ages on it. Thanks a lot Reeza.

Regards,

JP