Solved
Contributor
Posts: 22

# SAS predicted probabilities vs. hand calculation


Hi,

I have a follow-up question to the one I posted earlier today (quoted below).

Why are there slight differences between the predicted probabilities calculated by SAS:

proc logistic data=temp desc;

class EDU (ref='0') SEX (REF='0')  MRT(REF='0');

model outcome=AGE SEX EDU MRT;

output out=pp pred=pred;

run;

and using the standard formula:

pred=1/(1+exp(-1*(Intercept+ (AGE_P*AGE)+(SEX_P*SEX)+(EDU_P*EDU)+(MRT_P*MRT))));

Here Intercept, AGE_P, SEX_P, EDU_P, and MRT_P are the coefficients provided by SAS, and each slope coefficient is multiplied by that person's actual value.

The minimum difference between the predicted probabilities produced by SAS and those calculated by hand is 0.00003, the maximum is 0.0815, and the mean is 0.0219.

Thanks!

Emily

Hello,

I am trying to use the %str function to modify my code. Here it is:

%macro predictor;

data new;

set old;

pred=1/(1+exp(-1*(Intercept+ (AGE_P*AGE)+(SEX_P*SEX)+(EDU_P*EDU)

%if MRT=1 %then %str(

+ (MRT_P1))));

);

%else %if MRT=2 %then %str(

+ (MRT_P2))));

);

%else %str(

)));

);

run;

%mend predictor;

%predictor;

When I plug what's in the string functions individually, it works, so I know it doesn't have anything to do with how I coded the content (i.e. missing parentheses/semicolons). I've also used this function before in similar contexts, and never had a problem.

I'm getting the dreaded "There is no matching %IF statement for the %ELSE...A dummy macro will be compiled," error.

Any help would be appreciated.

Emily

Accepted Solutions
Solution
‎02-14-2018 02:14 PM
Super User
Posts: 23,357

## Re: SAS predicted probabilities vs. hand calculation

Posted in reply to epstewart1110

99.9% of the time the error is on the user side. In this case, I suspect you're getting small rounding errors, possibly from typing out the coefficient values by hand.

Either way, you can use the CODE statement in PROC LOGISTIC to see the code SAS would generate to calculate the predicted probabilities.

Or you can see this example that illustrates how to check it exactly:

https://communities.sas.com/t5/SAS-Statistical-Procedures/How-to-determine-logistic-regression-formu...

The Neuralgia2 data set is part of the PROC LOGISTIC Examples.

Oh, and you probably want PARAM=REF on your CLASS statement.
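To illustrate the two suggestions above: by default, PROC LOGISTIC uses effect coding for CLASS variables (design values of 1 and -1 rather than 0 and 1), so multiplying an estimate by the raw class value does not reproduce the linear predictor. The sketch below adds PARAM=REF and a CODE statement to the model from the question. The file name logistic_score.sas and the data set name scored are hypothetical, and the names of the predicted-probability variables that the generated code creates depend on the response variable and its levels.

```sas
proc logistic data=temp desc;
  /* Reference (0/1 dummy) coding instead of the default effect coding */
  class EDU (ref='0') SEX (ref='0') MRT (ref='0') / param=ref;
  model outcome = AGE SEX EDU MRT;
  /* Write the DATA step scoring code to a file (hypothetical file name) */
  code file='logistic_score.sas';
  output out=pp pred=pred;
run;

/* Apply the generated scoring code to the same data for comparison */
data scored;
  set temp;
  %include 'logistic_score.sas';
run;
```

Opening the generated file shows exactly how each class level enters the calculation, which can then be compared line by line with a hand-written formula.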


All Replies

Contributor
Posts: 22

## Re: SAS predicted probabilities vs. hand calculation

Thanks Reeza!

I accounted for rounding errors by outputting the coefficients (outest=coefficients), and merging them with the predicted probabilities (pred=pred). I then applied the formula in SAS, so both the coefficients and predicted probabilities come from the same model and are unrounded.

I did, however, alter my model using param=ref:

proc logistic data=temp desc outest=coefficients;

class EDU (ref='0') SEX (REF='0')  MRT(REF='0') / param=ref;

model outcome=AGE SEX EDU MRT;

output out=pp pred=pred;

run;

Is this correct?

Although the results are still not identical, using param=ref greatly reduced the discrepancies between SAS's predicted probabilities and those obtained by hand. The minimum difference is now 0, the maximum is 0.0295, and the mean is 0.0097480.

There must be some reason why they are still not exactly the same, but regardless, thanks for your help!
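The merge described in this post could look roughly like the sketch below. It rests on several assumptions that are not confirmed in the thread: that SEX and EDU each have levels 0 and 1, that MRT has levels 0, 1, and 2, and that the OUTEST= data set names class-level parameters by joining the variable name and the level (SEX1, EDU1, MRT1, MRT2). Printing the coefficients data set first is the safest way to confirm the actual names. The data set names coef and old are illustrative.

```sas
/* Inspect the actual parameter names before relying on them */
proc print data=coefficients;
run;

/* Rename the estimates so they do not collide with the covariates
   (the class-level names here are assumed, not confirmed) */
data coef;
  set coefficients (keep=Intercept AGE SEX1 EDU1 MRT1 MRT2);
  rename AGE=AGE_P SEX1=SEX_P EDU1=EDU_P MRT1=MRT_P1 MRT2=MRT_P2;
run;

/* Attach the single row of estimates to every observation in PP */
data old;
  if _n_ = 1 then set coef;  /* values are retained for all rows */
  set pp;
run;
```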

Emily

Super User
Posts: 23,357

## Re: SAS predicted probabilities vs. hand calculation

Posted in reply to epstewart1110

Post the code you used to do the newest comparison.


Contributor
Posts: 22

## Re: SAS predicted probabilities vs. hand calculation


Originally, I was struggling with how to multiply each class level by the correct class-dependent coefficient (in my first post). There are three levels for MRT, so I needed to pair each one with its corresponding class-dependent coefficient.

I ended up doing this:

%macro predictor;

data new;

set old;

%if MRT=1 %then %do;

pred=1/(1+exp(-1*(Intercept+ (AGE_P*AGE)+(SEX_P*SEX)+(EDU_P*EDU)+(MRT*MRT_P1))));

%end;

%else %if MRT=2 %then %do;

pred=1/(1+exp(-1*(Intercept+ (AGE_P*AGE)+(SEX_P*SEX)+(EDU_P*EDU)+(MRT*MRT_P2))));

%end;

%else %do;

pred=1/(1+exp(-1*(Intercept+ (AGE_P*AGE)+(SEX_P*SEX)+(EDU_P*EDU))));

%end;

%mend predictor;

%predictor;
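A note on the macro version above: macro %IF is evaluated while the macro runs, before the DATA step reads any data, so `%if MRT=1` compares the literal text MRT with 1 and never looks at an observation's value. The condition is therefore always false and only the final %ELSE branch is generated. A minimal sketch that shows this (the macro name demo is hypothetical):

```sas
%macro demo;
  /* Compares the text "MRT" with "1", not the value of a data set variable */
  %if MRT = 1 %then %put Macro branch: MRT equals 1;
  %else %put Macro branch: MRT does not equal 1;
%mend demo;

%demo
/* The log shows: Macro branch: MRT does not equal 1 */
```

Per-observation decisions belong in ordinary DATA step IF/ELSE statements.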

Astounding showed me a much more straightforward, simple way:

if mrt=1 then temp=mrt_p1;

else if mrt=2 then temp=mrt_p2;

else temp=0;

pred=1/(1+exp(-1*(Intercept+ (AGE_P*AGE)+(SEX_P*SEX)+(EDU_P*EDU) + (temp) )));

Once I did this, discrepancies between SAS predicted probabilities and those using the formula were minuscule.
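Put together as a complete step, the check might look like the sketch below. It assumes that old already holds the coefficients (Intercept, AGE_P, SEX_P, EDU_P, MRT_P1, MRT_P2), the covariates, and the SAS prediction pred on every row, that SEX and EDU are numeric 0/1 variables, and that the model was fit with PARAM=REF. The names xbeta, pred_hand, and diff are illustrative.

```sas
data new;
  set old;
  /* Select the MRT coefficient that matches this observation's level */
  if mrt = 1 then temp = MRT_P1;
  else if mrt = 2 then temp = MRT_P2;
  else temp = 0;  /* reference level contributes nothing */

  /* Linear predictor and hand-calculated probability */
  xbeta = Intercept + AGE_P*AGE + SEX_P*SEX + EDU_P*EDU + temp;
  pred_hand = 1 / (1 + exp(-xbeta));

  /* Absolute difference from the PROC LOGISTIC prediction */
  diff = abs(pred - pred_hand);
run;

/* Summarize the discrepancies */
proc means data=new min max mean;
  var diff;
run;
```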

Thanks Reeza!
