Good afternoon all.
I am currently trying to replicate in Excel the predicted probabilities from a PROC LOGISTIC run, for a default probability prediction on a sample of counterparties.
I ran PROC LOGISTIC with the model Default = pred1--pred8, where pred1 to pred8 are a series of financial ratios.
I got the estimates for my 8 predictors, the intercept calculated by SAS, and the predicted= values from SAS for my sample as well.
I started plugging the MLE estimates into Excel alongside my predictor values and applied the formula 1/(1+Exp(-(intercept+pred1*estimate1+...))), thinking I would get the same probability, but that is not the case. The differences between the SAS default probabilities and mine are huge, although the ranking/ordering seems right.
I must have done something wrong, but I cannot find out what. Any clue would be welcome, as I am quite new at this.
Is there a way to get the actual values SAS used to calculate the predicted probabilities, in case they are adapted from the reported estimates or something else?
Thank you very much for your opinion.
Best,
S.
Check which event you're modelling, and whether your result matches 1 - probability (the complement).
The predicted output can hold two probabilities, one for event=1 and one for event=0.
See the first example (Example 53.1, Stepwise Logistic Regression and Predicted Values) in the PROC LOGISTIC documentation for how to get the predicted values alongside the observations.
Otherwise, show the code you used and the calculation you implemented in the Excel sheet.
OK - Interesting.
I used the data in Example 53.1 and verified the results using event=1, but the estimates actually calculate the probability of event=0.
If I model event=0, the signs on the parameters are flipped and the probability is that of event=1.
That seems strange to me.
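The sign flip described above is what one would expect from a symmetric link: switching the modelled event negates every coefficient, and the resulting probability is the complement of the original one. A minimal Python sketch of that relationship, assuming a made-up linear predictor value `xb` (not taken from this thread) and hypothetical helper names `logistic_cdf` and `normal_cdf`:

```python
import math

def logistic_cdf(x):
    """Inverse logit link: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def normal_cdf(x):
    """Standard normal CDF (inverse probit link), via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

xb = 0.8  # hypothetical linear predictor when modelling event=1

for name, cdf in (("logit", logistic_cdf), ("probit", normal_cdf)):
    p_event1 = cdf(xb)    # probability from the event=1 parameterisation
    p_event0 = cdf(-xb)   # same fit with every coefficient sign flipped
    # The two probabilities are complements, so they sum to 1.
    print(name, p_event1, p_event0, p_event1 + p_event0)
```

So a mismatch of the form p versus 1-p points to the modelled event, while any other mismatch points elsewhere (for example, to the link function).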
Doh.
The formula is 1/(1+exp(-1*(xB))) or exp(xB)/(1+exp(xB))
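The two forms of the logit formula are algebraically the same (multiply the numerator and denominator of the first by exp(xB)). A small Python check, where the function names and the test values of `xb` are illustrative assumptions rather than anything from the thread:

```python
import math

def inv_logit_a(xb):
    """First form: 1 / (1 + exp(-xB))."""
    return 1.0 / (1.0 + math.exp(-xb))

def inv_logit_b(xb):
    """Second form: exp(xB) / (1 + exp(xB))."""
    e = math.exp(xb)
    return e / (1.0 + e)

# Hypothetical linear predictor values, chosen only for illustration.
for xb in (-3.0, -0.5, 0.0, 1.25, 4.0):
    a, b = inv_logit_a(xb), inv_logit_b(xb)
    print(xb, a, b, math.isclose(a, b))
```

Both forms apply to the logit link only.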
Reeza, thanks for checking.
Here is the code in SAS and an excel extract:
proc logistic data=predicted_ plots(only)=ROC outest=outest descending;
    /* pred1 = RatioEBITDAssets, pred2 = RatioOrdinaryprofitSales, etc. */
    model Defaut = pred1--pred10 / link=probit;
    output out=modele predicted=proba;
run;
Estimates:
intercept | pred1 | pred2 | pred3 | pred4 | pred5 | pred6 | pred7 | pred8 | pred9 | pred10
-7.60425 | 0.548002 | 0.472262 | 0.622244 | 0.173969 | -0.43673 | 0.205435 | 2.023561 | 0.577662 | 0.084357 | 3.209399971
Sample rows (predictor values, SAS probability, my calculated probability):
pred1 | pred2 | pred3 | pred4 | pred5 | pred6 | pred7 | pred8 | pred9 | pred10 | prob SAS | prob Calc
1.21801 | 0.390616 | 1.279925 | 0.847633 | 0.291004 | 1.236234 | 0.79677 | 0.983539 | 0.721735 | 0.775943761 | 0.092541611 | 0.998713281
0.830005 | 0.445053 | 1.085654 | 0.447736 | 0.49901 | 0.727731 | 0.748994 | 0.217191 | 0.80555 | 0.770496504 | 0.016109634 | 0.996057204
1.21801 | 0.445053 | 0.861114 | 0.847633 | 0.291004 | 0.553417 | 0.79677 | 0.605676 | 0.809705 | 0.775943761 | 0.045931346 | 0.997690352
0.830005 | 0.390616 | 0.861114 | 0.447736 | 0.291004 | 0.981329 | 0.817381 | 0.605676 | 0.809705 | 0.772191912 | 0.017793245 | 0.997207484
1.21801 | 1.053004 | 1.085654 | 0.905838 | 1.042944 | 1.236234 | 0.748994 | 0.954705 | 0.785543 | 0.775943761 | 0.076179617 | 0.998374142
0.830005 | 1.053004 | 0.097614 | 0.663831 | 0.49901 | 0.412175 | 0.748994 | 0.217191 | 0.809705 | 0.770496504 | 0.003747653 | 0.994388683
0.830005 | 0.445053 | 1.279925 | 0.905838 | 0.49901 | 0.727731 | 0.79677 | 0.977681 | 0.744094 | 0.770496504 | 0.068019116 | 0.998098659
I am confident in the SAS results; I only want to calculate something remotely close to them.
As you can see, it is not a "mere" issue of which outcome I model, so there is no p vs. (1-p) relationship...
Thanks again,
JP
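You're using a probit link, not a logit link, so your formula of 1/(1+exp(-xB)) isn't correct; that's for the logit link.
I think it's the standard normal CDF applied to xB that you're looking for instead: NORMSDIST(xB) in Excel (NORM.S.DIST(xB, TRUE) in newer versions), rather than the inverse normal function.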
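To illustrate the difference between the two links, here is a minimal Python sketch that applies both inverse links to the same linear predictor. The intercept, coefficients, and rows are made-up numbers for illustration only (they are not the estimates posted in this thread), and the function names are hypothetical. The standard normal CDF is computed from `math.erf`, which should correspond to what NORMSDIST returns in Excel.

```python
import math

def linear_predictor(intercept, coefs, values):
    """xB = intercept + sum of coefficient * predictor value."""
    return intercept + sum(b * x for b, x in zip(coefs, values))

def probit_prob(xb):
    """Inverse probit link: standard normal CDF of the linear predictor."""
    return 0.5 * (1.0 + math.erf(xb / math.sqrt(2.0)))

def logit_prob(xb):
    """Inverse logit link: only valid when the model was fit with link=logit."""
    return 1.0 / (1.0 + math.exp(-xb))

# Hypothetical estimates and observations (assumptions, not SAS output).
intercept = -2.0
coefs = [0.5, 1.5]
rows = [[1.0, 0.2], [0.4, 0.9], [2.0, 1.0]]

for values in rows:
    xb = linear_predictor(intercept, coefs, values)
    # Same xB, different link: the probabilities differ, the ordering does not.
    print(f"xB={xb:+.3f}  probit={probit_prob(xb):.6f}  logit={logit_prob(xb):.6f}")
```

Both functions increase with xB, which would be consistent with the observation that the ranking looked right even though the probabilities themselves were far off.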
Stupid me...
You are right, this fixes everything.
I could have spent ages on it. Thanks a lot Reeza.
Regards,
JP