Hi, folks-
Sorry for such a basic question, but what scale are the estimated means on in an LSMEANS statement (PROC LOGISTIC)? Also, what does ILINK do?
In my example here, I can't see that ILINK is doing anything. I estimate a logit model with an LSMEANS statement, then estimate the same model again with the "ILINK" and "OR" options added to LSMEANS. Here's the first model:
proc logistic data=data.climate23;
  class healthharm_rc (ref="No")
        housing_instab (ref="Very Stable")
        workstatus (ref="Full time")
        depressed_pm (ref="None/Never")
        trans (ref=FIRST) / order=internal param=glm;
  model healthharm_rc (event="Yes") = trans housing_instab depressed_pm hate_witness_gen workstatus;
  lsmeans housing_instab / diff=all;
  format trans trans. healthharm_rc hate_witness_gen yesno_alt. housing_instab housing_instab. depressed_pm menthealth_pm.;
run;
And here's the output:
If I'm understanding this correctly, in the first table, the estimates are on the logit scale (i.e., in terms of log odds). The second table's "Estimate" column gives the linear differences, also on the logit scale, between each pair of categories in "housing_instab". For example, the first row of the second table, ".03161", is obtained simply by subtracting the estimate associated with "Somewhat Stable" from that associated with "Fairly Stable": .4307 - .3990 = .0317, which matches .03161 up to rounding of the displayed values. So far, so good.
The second model just adds the "ILINK" and "OR" options to the LSMEANS statement:
proc logistic data=data.climate23;
  *weight fnwgt0;
  class healthharm_rc (ref="No")
        housing_instab (ref="Very Stable")
        workstatus (ref="Full time")
        depressed_pm (ref="None/Never")
        trans (ref=FIRST) / order=internal param=glm;
  model healthharm_rc (event="Yes") = trans housing_instab depressed_pm hate_witness_gen workstatus;
  lsmeans housing_instab / diff=all ilink or;
  format trans trans. healthharm_rc hate_witness_gen yesno_alt. housing_instab housing_instab. depressed_pm menthealth_pm.;
run;
Here are the corresponding tables:
And here are my questions:
1) In the first table, what are the values in the "Mean" column?
2) If the values in the "Mean" column are, in fact, least squares estimates of the coefficients for each level of "housing_instab", expressed in terms of logged odds (logit scale), what is the "Estimate" column?
3) The column "Odds Ratio" in the second table was added by the "OR" option and is simply the linear difference (third column) exponentiated, right?
4) What is the ILINK option doing?
The documentation for PROC LOGISTIC says ILINK "requests that estimates and their standard errors in the "Least Squares Means" table also be reported on the scale of the mean. This enables you to obtain estimates of predicted probabilities and their standard errors ..." But I can't see that any of the numbers in the tables are predicted probabilities.
I'm guessing that I'm somehow wrong, but could someone throw me a bone here?
Thanks,
David
Values in the Estimate column of the first LS-means table are on the logit (log odds) scale. In the second (differences) table, they are differences of log odds, which are log odds ratios. When you add the ILINK option, the values in the Mean column are on the event probability scale; that is, they are the inverse logit of the values in the Estimate column. The Estimate column is still on the log odds scale. With the ODDSRATIO option, the Odds Ratio column in the differences table contains the odds ratio estimates, obtained by exponentiating the differences of the log odds.
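If it helps to see the relationship between these columns, here is a minimal sketch that reproduces the arithmetic by hand. It assumes the two log-odds values quoted in the question (.4307 for "Fairly Stable" and .3990 for "Somewhat Stable"); the variable names are made up for illustration and do not come from the PROC LOGISTIC output.

```sas
/* Hand check of the LSMEANS scales, using the two log-odds values  */
/* quoted in the question. All variable names are hypothetical.     */
data _null_;
  est_fairly   = 0.4307;  /* Estimate column: log odds, "Fairly Stable"   */
  est_somewhat = 0.3990;  /* Estimate column: log odds, "Somewhat Stable" */

  /* ILINK applies the inverse logit, giving event probabilities (Mean column) */
  mean_fairly   = 1 / (1 + exp(-est_fairly));
  mean_somewhat = 1 / (1 + exp(-est_somewhat));

  /* DIFF reports the difference of log odds, i.e. a log odds ratio */
  diff = est_fairly - est_somewhat;

  /* ODDSRATIO exponentiates that difference */
  odds_ratio = exp(diff);

  put mean_fairly= 6.4 mean_somewhat= 6.4 diff= 8.5 odds_ratio= 6.4;
run;
```

The values written to the log should agree with the Mean and Odds Ratio columns only up to rounding, since the inputs here are the rounded estimates from the printed table.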