Solved: how to use SAS to generate predicted probabilities of predictors in lo...

joe66 · Posted 05-11-2018 05:25 PM

Hi,

I have a question about the correct way to calculate predicted probabilities by predictor level in logistic regression using SAS.

The logistic regression model is as below:

outcome: success (binary, yes or no)

predictor: education level (binary, under or graduate)

control variables: age (age group) and gender

my SAS code:

(1) Fit the logistic model and export the predicted probability of success="Yes" for every observation

proc logistic data=data;
  class age gender;
  model success(event="Yes")=age gender edu;
  output out=pred p=p;
run;

(2) Compute the LS-means of the predicted probabilities for the predictor from the exported data

proc genmod data=pred;
  class age gender;
  model p=age gender edu;
  lsmeans edu;
run;

As I understand it, this gives the average predicted probability at each predictor level (under or graduate) while holding age and gender constant.

However, I have heard that it is better to calculate predicted probabilities with the "marginal standardization" method, as in Stata.

The Stata command is something like:

margins edu, post

I compared the results from both approaches and they differ, so I am wondering which approach is better.

Thanks

PGStats · Posted 05-12-2018 12:28 AM

Why not use LSMEANS in PROC LOGISTIC and compare those with the Stata estimates?

PG

joe66 · Posted 05-14-2018 02:50 AM

Hi PG,

I used PROC GENMOD rather than PROC LOGISTIC because the outcome variable "p" is continuous. I did compare the SAS and Stata results, and they differ, so I am wondering which one is correct.

Thanks

StatDave · Posted 05-14-2018 10:39 AM

The LSMEANS statement does not necessarily compute predictive margins which use the marginal standardization method you mention. However, in the case where you want predictive margins for one variable while holding all other predictors at their means, then I think the LSMEANS statement can be used. But note that the LSMEANS statement can only be used for a model effect that is (or is made up of) a CLASS variable, and all CLASS variables must use the non-full rank GLM parameterization. If your Age variable is grouped as you indicate, then all this can be done when you fit your model in PROC LOGISTIC. Use the ILINK option if you want the estimates at each Age level to be on the probability scale rather than the logit (log odds) scale. The E option shows you the linear combination of model parameters that the LSMEANS statement computes. Note that options are available in the LSMEANS statement (particularly OM= and BYLEVEL) to alter the coefficients that are used for the CLASS predictors. Of course, you can always use the ESTIMATE statement to compute any desired (but estimable) linear combination of the parameters.

proc logistic data=data;
  class age gender / param=glm;
  model success(event="Yes")=age gender edu;
  lsmeans age / ilink e;
run;
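For anyone comparing this output against Stata, it may help to write out the two quantities side by side. This is only a sketch in notation introduced here (the symbols are not from the thread): $\hat{\beta}$ is the fitted coefficient vector, $\ell_j$ is the coefficient row that the E option prints for level $j$, and $x_{i(j)}$ is observation $i$'s design row with the variable of interest set to level $j$ and everything else left as observed.

```latex
% LS-mean for level j, back-transformed by ILINK.
% \ell_j fixes the effect at level j and puts the other predictors at
% reference values: by default equal weights over the levels of the other
% CLASS effects, and the mean for continuous covariates.
\hat{p}^{\,\mathrm{LS}}_j = \frac{1}{1 + \exp\!\left(-\ell_j^{\top}\hat{\beta}\right)}

% Predictive margin for level j: set the variable to j for every
% observation, predict, then average the n predicted probabilities.
\hat{p}^{\,\mathrm{M}}_j = \frac{1}{n}\sum_{i=1}^{n}
  \frac{1}{1 + \exp\!\left(-x_{i(j)}^{\top}\hat{\beta}\right)}
```

Because the inverse logit is nonlinear, the probability evaluated at one averaged row is generally not the average of the individual probabilities, so the two quantities agree only when every $x_{i(j)}$ is the same row.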

joe66 · Posted 05-17-2018 03:55 PM

Thanks for your reply!

I ran this code with the ILINK option in SAS and got the predicted probabilities. However, the results are about 10% higher than those generated by Stata's "margins" command.

So I am confused about which is the correct way to calculate predicted probabilities. Any comments are welcome!

Thanks again!

StatDave · Posted 03-15-2021 10:28 AM

Predictive margins and LS-means are not the same in general. LS-means are linear combinations of model parameters. Margins are averages of predicted values. They are the same only when margins are computed with all other predictors fixed. Typically, only the variable for which the margins are computed is fixed and the other predictors vary as observed when the averaging is done. If you want margins rather than LS-means, then use the Margins macro.
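As a rough illustration of "averages of predicted values", the margins for edu can also be computed by hand with base procedures. This is a minimal sketch, not the Margins macro itself. It assumes edu is a numeric variable coded 0/1 (the thread does not give the coding), and the item store name logmod and the data set names cf and scored are made up for the example.

```sas
/* Fit the model from the thread and save it in an item store. */
proc logistic data=data;
  class age gender / param=glm;
  model success(event="Yes") = age gender edu;
  store logmod;                 /* hypothetical item store name */
run;

/* Copy every observation once per edu level (assumed coding: 0 and 1), */
/* leaving age and gender as observed.                                   */
data cf;
  set data;
  do edu = 0, 1;
    output;
  end;
run;

/* Score the copies; ILINK returns probabilities instead of logits. */
proc plm restore=logmod;
  score data=cf out=scored predicted=phat / ilink;
run;

/* The mean of phat within each edu level is the predictive margin. */
proc means data=scored mean;
  class edu;
  var phat;
run;
```

This gives point estimates only. The Margins macro also supplies standard errors and confidence limits, so it is the better choice for anything beyond a quick check. If some rows were dropped from the fit because of missing values, restrict cf to the rows actually used so the average is taken over the estimation sample.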
