- how to use SAS to generate predicted probabilities of predictors in lo...

Posted 05-11-2018 05:25 PM
(15267 views)

Hi,

What is the correct way to calculate predicted probabilities by predictor level in a logistic regression using SAS?

The logistic regression model is as below:

outcome: success (binary, yes or no)

predictor: education level (binary, undergraduate or graduate)

control variables: age (age group) and gender

My SAS code:

(1) Fit the logistic model and export each observation's predicted probability of the event success="Yes":

    proc logistic data=data;
      class age gender;
      model success(event="Yes")=age gender edu;
      output out=pred p=p;
    run;

(2) Calculate the LS-means of the predicted probabilities for the predictor, using the exported data:

    proc genmod data=pred;
      class age gender;
      model p=age gender edu;
      lsmeans edu;
    run;

My thinking is that this gives the average predicted probability at each predictor level (undergraduate or graduate) while holding age and gender constant.

However, I have heard that it is better to calculate predicted probabilities in Stata using the “marginal standardization” method.

The Stata command looks like this:

    margins edu, post

I compared the results from the two approaches and they are different, so I am wondering which one is better.

Thanks

5 REPLIES 5

Why not use LSMEANS in PROC LOGISTIC and compare those results with the Stata estimates?

PG


Hi PG,

I used PROC GENMOD rather than PROC LOGISTIC because the outcome variable "p" is continuous. I did compare the results from SAS and Stata, and they are different, so I am wondering which one is the correct way.

Thanks


The LSMEANS statement does not necessarily compute predictive margins, which are what the marginal standardization method you mention produces. However, if you want predictive margins for one variable while holding all other predictors at their means, then I think the LSMEANS statement can be used. Note that the LSMEANS statement can only be used for a model effect that is (or is made up of) CLASS variables, and all CLASS variables must use the non-full-rank GLM parameterization (PARAM=GLM).

If your Age variable is grouped as you indicate, then all of this can be done when you fit your model in PROC LOGISTIC. Use the ILINK option if you want the estimates at each Age level to be on the probability scale rather than the logit (log odds) scale. The E option shows the linear combination of model parameters that the LSMEANS statement computes. Options are also available in the LSMEANS statement (particularly OM= and BYLEVEL) to alter the coefficients that are used for the CLASS predictors. Of course, you can always use the ESTIMATE statement to compute any desired (but estimable) linear combination of the parameters.

    proc logistic data=data;
      class age gender / param=glm;
      model success(event="Yes")=age gender edu;
      lsmeans age / ilink e;
    run;
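For comparison with Stata's `margins edu`, marginal standardization can be approximated in SAS by scoring counterfactual copies of the data and averaging the predictions. The following is a minimal, untested sketch rather than a definitive implementation. It assumes `edu` is a numeric variable coded 0/1; the item-store name `work.logmod` and the data set names `cf` and `cf_pred` are hypothetical.

```sas
/* Step 1: fit the model and save it to an item store (name is illustrative). */
proc logistic data=data;
  class age gender / param=glm;
  model success(event="Yes") = age gender edu;
  store work.logmod;
run;

/* Step 2: make two copies of every observation, one with edu forced to 0   */
/* and one with edu forced to 1 (assumes edu is numeric and coded 0/1).     */
data cf;
  set data;
  edu = 0; output;
  edu = 1; output;
run;

/* Step 3: score the counterfactual data; ILINK returns probabilities       */
/* rather than values on the logit scale.                                   */
proc plm restore=work.logmod;
  score data=cf out=cf_pred predicted=p / ilink;
run;

/* Step 4: average the predicted probabilities within each forced edu level. */
proc means data=cf_pred mean;
  class edu;
  var p;
run;
```

The two means from the last step are point estimates only. Standard errors comparable to those from Stata would need an additional step such as the delta method or bootstrapping, which this sketch does not attempt.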


Thanks for your reply!

I ran this code with the ILINK option in SAS and got predicted probabilities. However, the results are about 10% higher than those generated by Stata using the "margins" command.

So I am confused about which is the correct way to calculate predicted probabilities. Any comments are welcome!

Thanks again!
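One plausible reason for a gap of this kind, sketched here under the assumption that the same model is fitted in both packages: the two approaches average at different stages. The notation below is illustrative and not from the thread. Let $g^{-1}$ be the inverse logit, $\hat\beta$ the fitted coefficients, and $x_{i,e}$ the covariates of observation $i$ with the education variable set to level $e$.

$$
\hat p^{\,\mathrm{LS}}_e = g^{-1}\!\left(\bar{x}_e^{\top}\hat\beta\right)
\qquad\text{versus}\qquad
\hat p^{\,\mathrm{MS}}_e = \frac{1}{n}\sum_{i=1}^{n} g^{-1}\!\left(x_{i,e}^{\top}\hat\beta\right)
$$

An LS-mean with ILINK evaluates the linear predictor at a single reference profile $\bar{x}_e$ (by default equal weights across the levels of the other CLASS variables, or observed margins when OM= is specified) and then applies the inverse link. Marginal standardization applies the inverse link to each observation first and averages afterwards. Because $g^{-1}$ is nonlinear, the two quantities generally differ, so neither result is necessarily a computational error; they answer slightly different questions.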
