schnitt
Calcite | Level 5

Good afternoon all.

I am currently trying to replicate in Excel the predicted probabilities from PROC LOGISTIC, for a default-probability prediction on a sample of counterparties.

I ran PROC LOGISTIC with Default = pred1 -- pred8, where pred1 to pred8 are a series of financial ratios.

I got the estimates for all 8 predictors, the intercept calculated by SAS, and the predicted= values from SAS for my sample as well.

I started plugging the MLE estimates in alongside my predictor values and applied the formula 1/(1+Exp(-(intercept+pred1*estimate1+...))), thinking I would get the same probabilities, but that is not the case. The differences between the SAS default probabilities and mine are huge, although the ranking/ordering seems to be right.

I must have done something wrong, but I cannot find out what... Any clue would be welcome - I am quite new at this :)

Is there a way to get the actual values SAS uses to calculate the predicted probabilities, in case they are adapted from the reported estimates, or anything else?

Thank you very much for your opinion.

Best,

S.

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

You're using a probit link, not a logit link, so your formula of 1/(1+exp(-xB)) isn't correct here; that's the inverse of the logit link.

I think it's the standard normal CDF applied to xB that you're looking for instead, i.e. the NORMSDIST (or NORM.S.DIST) Excel function, not the inverse normal.


6 REPLIES
Reeza
Super User

Check which event you're modelling, and whether your result matches 1 - probability (the complement).

In the predicted output there are two probabilities that are specified, one for event=1 and one for event=0.

See the first example (Example 53.1 Stepwise logistic regression and predicted values) in the PROC LOGISTIC documentation for how to get the predicted values and the observations.

Otherwise, show the code you used and the Excel calculation you implemented.

Reeza
Super User

OK - Interesting.

I used the data in Example 53.1 and verified the results using event=1, but the estimates actually calculate the probability of event=0.

If I model event=0, the signs on the parameters are flipped and the probability is for event=1.

That seems strange to me.
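
The sign flip can be checked numerically. Below is a minimal sketch in Python (not SAS), using a hypothetical linear predictor value that is not taken from Example 53.1: negating xB, which is what flipping every parameter sign does, turns the probability of one event into the probability of the other.

```python
import math

def inv_logit(xb):
    """Inverse logit link: maps a linear predictor xB to a probability."""
    return 1.0 / (1.0 + math.exp(-xb))

# Hypothetical linear predictor (intercept + sum of estimate * predictor).
xb = 0.8

p_event1 = inv_logit(xb)    # probability when event=1 is modelled
p_event0 = inv_logit(-xb)   # all parameter signs flipped: event=0 is modelled

print(f"P(event=1) = {p_event1:.6f}")
print(f"P(event=0) = {p_event0:.6f}")
print(f"sum        = {p_event1 + p_event0:.6f}")
```

The two probabilities are complements of each other, so a mismatch of this kind shows up as p versus 1 - p.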

Reeza
Super User

Doh.

The formula is 1/(1+exp(-1*(xB))) or exp(xB)/(1+exp(xB))
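
As a quick check that the two ways of writing the logit formula agree, here is a small Python sketch. The xB values are hypothetical and chosen only to cover negative, zero, and positive linear predictors.

```python
import math

def inv_logit_a(xb):
    """First form: 1 / (1 + exp(-xB))."""
    return 1.0 / (1.0 + math.exp(-xb))

def inv_logit_b(xb):
    """Second form: exp(xB) / (1 + exp(xB))."""
    return math.exp(xb) / (1.0 + math.exp(xb))

# Hypothetical linear-predictor values, for illustration only.
xb_values = (-3.0, -0.5, 0.0, 1.2, 4.0)

for xb in xb_values:
    a = inv_logit_a(xb)
    b = inv_logit_b(xb)
    print(f"xB={xb:+.1f}  form A={a:.6f}  form B={b:.6f}")
```

Both forms apply to the logit link only.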

schnitt
Calcite | Level 5

Reeza, thanks for checking.

Here is the code in SAS and an excel extract:

proc logistic data = predicted_ plots(only)=ROC outest = outest descending;
  model Defaut = pred1--pred10 / link = probit;
  /* pred1 = RatioEBITDAssets, pred2 = RatioOrdinaryprofitSales, etc. */
  output out = modele predicted = proba;
run;

intercept  pred1     pred2     pred3     pred4     pred5     pred6     pred7     pred8     pred9     pred10
-7.60425   0.548002  0.472262  0.622244  0.173969  -0.43673  0.205435  2.023561  0.577662  0.084357  3.209399971

pred1     pred2     pred3     pred4     pred5     pred6     pred7     pred8     pred9     pred10       prob SAS     prob Calc
1.21801   0.390616  1.279925  0.847633  0.291004  1.236234  0.79677   0.983539  0.721735  0.775943761  0.092541611  0.998713281
0.830005  0.445053  1.085654  0.447736  0.49901   0.727731  0.748994  0.217191  0.80555   0.770496504  0.016109634  0.996057204
1.21801   0.445053  0.861114  0.847633  0.291004  0.553417  0.79677   0.605676  0.809705  0.775943761  0.045931346  0.997690352
0.830005  0.390616  0.861114  0.447736  0.291004  0.981329  0.817381  0.605676  0.809705  0.772191912  0.017793245  0.997207484
1.21801   1.053004  1.085654  0.905838  1.042944  1.236234  0.748994  0.954705  0.785543  0.775943761  0.076179617  0.998374142
0.830005  1.053004  0.097614  0.663831  0.49901   0.412175  0.748994  0.217191  0.809705  0.770496504  0.003747653  0.994388683
0.830005  0.445053  1.279925  0.905838  0.49901   0.727731  0.79677   0.977681  0.744094  0.770496504  0.068019116  0.998098659

I am confident in the SAS results, I only want to calculate something remotely close to that :)

As you can see, it is not a "mere" issue of which outcome I model, so there is no p versus (1-p) relationship...

Thanks again,

JP

Reeza
Super User

You're using a probit link, not a logit link, so your formula of 1/(1+exp(-xB)) isn't correct here; that's the inverse of the logit link.

I think it's the standard normal CDF applied to xB that you're looking for instead, i.e. the NORMSDIST (or NORM.S.DIST) Excel function, not the inverse normal.
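
To illustrate the difference between the two links, here is a rough sketch in Python rather than SAS or Excel. The intercept, coefficients, and observations are hypothetical and are not the estimates posted above. With a probit link the predicted probability is the standard normal CDF of the linear predictor xB; feeding the same xB into the logit formula gives different probabilities but the same ordering, since both functions are increasing in xB.

```python
from math import exp
from statistics import NormalDist

# Hypothetical estimates and observations, for illustration only.
intercept = -2.0
coefs = [0.5, 1.2]
rows = [[1.0, 0.3], [0.4, 0.9], [2.0, 1.5]]

def linear_predictor(x):
    """xB = intercept + sum of estimate * predictor value."""
    return intercept + sum(b * v for b, v in zip(coefs, x))

def probit_prob(xb):
    """Inverse probit link: standard normal CDF of xB."""
    return NormalDist().cdf(xb)

def logit_prob(xb):
    """Inverse logit link, shown only for comparison."""
    return 1.0 / (1.0 + exp(-xb))

xbs = [linear_predictor(x) for x in rows]
for xb in xbs:
    print(f"xB={xb:+.2f}  probit={probit_prob(xb):.6f}  logit={logit_prob(xb):.6f}")
```

In Excel, the probit column corresponds to NORM.S.DIST(xB, TRUE), or NORMSDIST(xB) in older versions.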

schnitt
Calcite | Level 5

Stupid me...

You are right, this fixes everything.

I could have spent ages on it. Thanks a lot Reeza.

Regards,

JP
