Good afternoon all.
I am currently trying to replicate in Excel the predicted probabilities from PROC LOGISTIC, for a default probability prediction on a sample of counterparties.
I ran PROC LOGISTIC with the model Default = pred1--pred8, where pred1 to pred8 are a series of financial ratios.
I got the estimates for my 8 predictors and the intercept calculated by SAS, as well as the predicted= values from SAS for my sample.
I started plugging the MLE estimates and my predictor values into the formula 1/(1+Exp(-(intercept+pred1*estimate1+...))), thinking I would get the same probabilities, but that is not the case. The differences between the SAS default probabilities and mine are huge (although the ranking/ordering seems right).
I must have done something wrong, but I cannot find out what... Any clue would be welcome, as I am quite new at this.
Is there a way to get the actual values SAS uses to calculate the predicted probabilities, in case they are adapted from the reported estimates or something else is going on?
Thank you very much for your opinion.
Best,
S.
Check which event you're modelling: does your result match 1 - probability (the complementary event)?
In the predicted output there are two probabilities specified, one for event=1 and one for event=0.
See the first example (Example 53.1, Stepwise Logistic Regression and Predicted Values) in the PROC LOGISTIC documentation for how to get the predicted values alongside the observations.
Otherwise, show the code you used and the Excel calculation you implemented.
OK - Interesting.
I used the data in Example 53.1 and verified the results using event=1, but the estimates actually calculate the probability of event=0.
If I model event=0, the signs on the parameters are flipped and the probability is for event=1.
That seems strange to me.
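For what it's worth, a sign flip like this is what you would expect from a symmetric link such as the logit: negating every parameter negates the linear predictor xB, and 1/(1+exp(xB)) is exactly 1 minus 1/(1+exp(-xB)). Here is a minimal Python sketch of that identity; the intercept, slope and predictor value are made up for illustration and are not taken from Example 53.1.

```python
import math

def inv_logit(eta):
    """Inverse logit link: probability for a given linear predictor eta."""
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical intercept, slope and predictor value (illustration only).
intercept, slope, x = -1.5, 0.8, 2.0

eta = intercept + slope * x      # linear predictor when modelling event=1
p_event1 = inv_logit(eta)        # P(event=1)
p_event0 = inv_logit(-eta)       # every sign flipped: P(event=0)

# The two probabilities are complements of each other.
print(p_event1, p_event0, p_event1 + p_event0)
```

So the two parameterisations describe the same fit; only the event whose probability is being reported changes.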
Doh.
The formula is 1/(1+exp(-1*(xB))) or, equivalently, exp(xB)/(1+exp(xB)).
Reeza, thanks for checking.
Here is the code in SAS and an excel extract:
proc logistic data=predicted_ plots(only)=roc outest=outest descending;
   /* pred1 = RatioEBITDAssets, pred2 = RatioOrdinaryprofitSales, etc. */
   model Defaut = pred1--pred10 / link=probit;
   output out=modele predicted=proba;
run;
Estimates (from outest):
intercept | pred1 | pred2 | pred3 | pred4 | pred5 | pred6 | pred7 | pred8 | pred9 | pred10
-7.60425 | 0.548002 | 0.472262 | 0.622244 | 0.173969 | -0.43673 | 0.205435 | 2.023561 | 0.577662 | 0.084357 | 3.209399971

Sample observations with the SAS and calculated probabilities:
pred1 | pred2 | pred3 | pred4 | pred5 | pred6 | pred7 | pred8 | pred9 | pred10 | prob SAS | prob Calc
1.21801 | 0.390616 | 1.279925 | 0.847633 | 0.291004 | 1.236234 | 0.79677 | 0.983539 | 0.721735 | 0.775943761 | 0.092541611 | 0.998713281
0.830005 | 0.445053 | 1.085654 | 0.447736 | 0.49901 | 0.727731 | 0.748994 | 0.217191 | 0.80555 | 0.770496504 | 0.016109634 | 0.996057204
1.21801 | 0.445053 | 0.861114 | 0.847633 | 0.291004 | 0.553417 | 0.79677 | 0.605676 | 0.809705 | 0.775943761 | 0.045931346 | 0.997690352
0.830005 | 0.390616 | 0.861114 | 0.447736 | 0.291004 | 0.981329 | 0.817381 | 0.605676 | 0.809705 | 0.772191912 | 0.017793245 | 0.997207484
1.21801 | 1.053004 | 1.085654 | 0.905838 | 1.042944 | 1.236234 | 0.748994 | 0.954705 | 0.785543 | 0.775943761 | 0.076179617 | 0.998374142
0.830005 | 1.053004 | 0.097614 | 0.663831 | 0.49901 | 0.412175 | 0.748994 | 0.217191 | 0.809705 | 0.770496504 | 0.003747653 | 0.994388683
0.830005 | 0.445053 | 1.279925 | 0.905838 | 0.49901 | 0.727731 | 0.79677 | 0.977681 | 0.744094 | 0.770496504 | 0.068019116 | 0.998098659
I am confident in the SAS results; I only want to calculate something remotely close to them.
As you can see, it is not a "mere" issue of which outcome I model, so this is not a p versus (1-p) relationship...
Thanks again,
JP
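You're using a probit link, not a logit link, so your formula 1/(1+exp(-xB)) isn't correct; that's the inverse of the logit link.
I think it's the standard normal CDF you're looking for instead, i.e. the Excel function NORM.S.DIST(xB, TRUE) (NORMSDIST(xB) in older versions), applied to the linear predictor xB.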
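To make the difference concrete, here is a minimal Python sketch. It assumes made-up estimates and ratio values (none of the numbers come from your table), and the helper names are my own. With a probit link the linear predictor xB is turned into a probability by the standard normal CDF, whereas 1/(1+exp(-xB)) is the inverse of the logit link. Both functions are increasing in xB, which would explain why your ranking looked right while the values did not.

```python
import math

def linear_predictor(intercept, coefs, xs):
    """xB = intercept + sum of coefficient * predictor value."""
    return intercept + sum(c * x for c, x in zip(coefs, xs))

def inv_logit(eta):
    """Inverse logit link: 1 / (1 + exp(-eta))."""
    return 1.0 / (1.0 + math.exp(-eta))

def inv_probit(eta):
    """Inverse probit link: standard normal CDF evaluated at eta."""
    return 0.5 * math.erfc(-eta / math.sqrt(2.0))

# Hypothetical estimates and observations (illustration only).
intercept = -3.0
coefs = [0.5, 1.2]
observations = [[1.0, 0.4], [2.0, 1.1]]

for xs in observations:
    eta = linear_predictor(intercept, coefs, xs)
    print(f"xB={eta:.3f}  probit={inv_probit(eta):.4f}  logit={inv_logit(eta):.4f}")
```

In the spreadsheet, the probit column would be the same sum of intercept plus estimate times ratio, wrapped in NORM.S.DIST(..., TRUE) instead of the 1/(1+EXP(-...)) expression.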
Stupid me...
You are right, this fixes everything.
I could have spent ages on it. Thanks a lot Reeza.
Regards,
JP