How to compute a std err for LSM differences?

01-05-2013 04:36 PM

Steve,

This is a (rather delayed) follow-up to a question that Steve Denham answered in September 2011. Please see my question "How to make LSM differences more useful in PROC GENMOD?"

My problem at that time was working out how to get a least squares means (LSM) difference from procedures when the link is not the identity. In those cases (for example in PROC GLIMMIX, GENMOD, etc.) the ILINK option gives the correct result for the LS means, but the LS means differences are wrong (ILINK does not know that you cannot subtract two log quantities; this programming deficiency needs to be fixed).

Your answer was quite clever (as usual, Steve!), but I have found that the std. errors and confidence intervals produced by this method are much too large. To diagnose the problem, I ran a link=id model instead of a link=log model (in this case ILINK gives the right LSM difference result), so we can use that example to see where the problem lies. As you can see in the upper table, your method for calculating the std. error of the individual LSM estimates matches the SAS output exactly, but the pooled std. error estimates are much higher (middle table), so the resulting confidence intervals are much wider than what SAS computes (lower table). The problem seems to be how to calculate the std. error for the LSM difference: the pooled method adds the errors, but apparently we need some other method. Do you have any ideas?

Ron Levine

Least Squares Means (Output from model and Steve's Std Err calculation)

| cohort | CombinedCompl | Estimate | StdErr (SAS) | StdErr (Steve) | DF | tValue | Probt |
|---|---|---|---|---|---|---|---|
| a_COMBO | 0 | 12.7795 | 0.7772 | 0.77722 | 153769 | 16.44 | <.0001 |
| a_COMBO | 1 | 19.5782 | 0.7786 | 0.77862 | 153769 | 25.14 | <.0001 |
| b_VALVE | 0 | 12.2007 | 0.7746 | 0.77462 | 153769 | 15.75 | <.0001 |
| b_VALVE | 1 | 18.375 | 0.7763 | 0.77634 | 153769 | 23.67 | <.0001 |
| c_CABG | 0 | 10.3971 | 0.7719 | 0.77191 | 153769 | 13.47 | <.0001 |
| c_CABG | 1 | 14.9921 | 0.7732 | 0.7732 | 153769 | 19.39 | <.0001 |

Least Squares Means Differences (Steve method)

| Obs | cohort | CombinedCompl | meandiff | poolstderr | DF | _95pctLCL | _95pctUCL | Probt |
|---|---|---|---|---|---|---|---|---|
| 1 | a_COMBO | 0 | 6.79872 | 1.10014 | 153769 | 4.64248 | 8.95496 | <.0001 |
| 2 | b_VALVE | 0 | 6.17431 | 1.0967 | 153769 | 4.02481 | 8.32381 | <.0001 |
| 3 | c_CABG | 0 | 4.59502 | 1.09256 | 153769 | 2.45363 | 6.73641 | <.0001 |

Least Squares Means Differences (SAS method)

| Obs | cohort | CombinedCompl | Estimate | Standard Error | DF | _95pctLCL | _95pctUCL | Pr > \|t\| |
|---|---|---|---|---|---|---|---|---|
| 1 | a_COMBO | 0 | 6.7987 | 0.1435 | 153769 | 6.5175 | 7.08 | <.0001 |
| 2 | b_VALVE | 0 | 6.1743 | 0.1159 | 153769 | 5.9471 | 6.4016 | <.0001 |
| 3 | c_CABG | 0 | 4.595 | 0.06972 | 153769 | 4.4584 | 4.7317 | <.0001 |
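
For readers who want to check the numbers, here is a minimal Python sketch of what the "pooled" calculation in the middle table appears to do: it combines the two individual standard errors as if the two LS means were uncorrelated. The function name `pooled_se_independent` is made up for illustration, and the inputs are the rounded a_COMBO values from the upper table, so the last digit may differ slightly from the posted 1.10014.

```python
import math

# Standard errors of the two a_COMBO LS means, copied from the
# "StdErr (Steve)" column of the upper table (rounded values).
se_compl0 = 0.77722
se_compl1 = 0.77862


def pooled_se_independent(se_a: float, se_b: float) -> float:
    """SE of a difference ASSUMING the two estimates are uncorrelated.

    Hypothetical helper for illustration: square root of the sum of
    the two squared standard errors.
    """
    return math.sqrt(se_a ** 2 + se_b ** 2)


pooled = pooled_se_independent(se_compl0, se_compl1)
print(f"pooled SE under independence: {pooled:.4f}")
```

The result is close to the 1.10014 in the middle table and far above the 0.1435 that SAS reports, which suggests the independence assumption is where the two methods part ways.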

Accepted Solutions

01-07-2013 05:47 PM

Thanks 1zmm,

I appreciate Steve and 1zmm working through this problem. As many statisticians know, GENMOD, GLIMMIX, etc. are great (potential) tools, but if you can't extract a useful answer from the model, the technology is essentially useless. Maybe ILINK will be fixed some day, but in the meantime your fix is very insightful. I can get the difference in means by a simple subtraction, and now, having a way to compute the std. error, I can create correct confidence intervals: everything you need for reporting a result.

Thanks everyone.

Ron Levine
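
As a rough illustration of the last step mentioned here (difference plus standard error gives a confidence interval), the sketch below builds a 95% interval for the a_COMBO difference from the SAS table above. It assumes a normal quantile is an acceptable stand-in for the t quantile, which seems reasonable with 153769 DF; `wald_ci` is a hypothetical helper name, not a SAS or library function.

```python
from statistics import NormalDist


def wald_ci(estimate: float, se: float, level: float = 0.95):
    """Two-sided interval: estimate +/- z * se.

    Assumption: normal quantile used in place of the t quantile,
    which is nearly identical when the DF are very large.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return estimate - z * se, estimate + z * se


# a_COMBO row of the "SAS method" table: Estimate 6.7987, Standard Error 0.1435
lower, upper = wald_ci(6.7987, 0.1435)
print(f"95% CI: ({lower:.4f}, {upper:.4f})")
```

The limits agree with the 6.5175 and 7.08 that SAS reports to within rounding.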

All Replies

01-06-2013 01:34 PM

The method that Steve uses for calculating the standard errors of the differences in the least-squares mean estimates within each of the three cohorts assumes that these estimates are statistically independent (or uncorrelated). Apparently, however, these estimates must be strongly negatively correlated for SAS to obtain standard errors of these differences so much smaller than those from Steve's method.

Therefore, you will have to specify the COV option on the LSMEANS statement to write to the output the variance-covariance matrix for the least-square mean estimates (I'm not sure whether this matrix can be written to a SAS data set for further manipulation using the ODS Table, LSMEANS). The variance of the difference between two least-square estimates within each cohort is the sum of the variance estimates and the two (identical) covariance estimates for these within-cohort least-square mean estimates; I would again presume that these latter covariance estimates are negative to obtain the small standard errors for the differences that you've gotten. The standard error of the difference between the two least-square estimates within each cohort is the square root of this variance of the difference.

01-06-2013 05:12 PM

Hi 1zmm,

Adding COV to the LSMEANS statement yields the following output:

| cohort | Compl | Estimate | StdErr | DF | Lower | Upper | Cov1 | Cov2 | Cov3 | Cov4 | Cov5 | Cov6 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| a_COMBO | 0 | 12.7795 | 0.7772 | 153769 | 11.2561 | 14.3028 | 0.6041 | 0.5949 | 0.5945 | 0.5944 | 0.5942 | 0.5944 |
| a_COMBO | 1 | 19.5782 | 0.7786 | 153769 | 18.0521 | 21.1043 | 0.5949 | 0.6063 | 0.5949 | 0.5948 | 0.5947 | 0.5949 |
| b_VALVE | 0 | 12.2007 | 0.7746 | 153769 | 10.6824 | 13.719 | 0.5945 | 0.5949 | 0.6001 | 0.5947 | 0.5945 | 0.5945 |
| b_VALVE | 1 | 18.375 | 0.7763 | 153769 | 16.8534 | 19.8966 | 0.5944 | 0.5948 | 0.5947 | 0.6027 | 0.5942 | 0.5944 |
| c_CABG | 0 | 10.3971 | 0.7719 | 153769 | 8.8841 | 11.91 | 0.5942 | 0.5947 | 0.5945 | 0.5942 | 0.5959 | 0.5944 |
| c_CABG | 1 | 14.9921 | 0.7732 | 153769 | 13.4766 | 16.5076 | 0.5944 | 0.5949 | 0.5945 | 0.5944 | 0.5944 | 0.5979 |

I'm not sure whether this is what you are looking for: I don't see a var/covar matrix here, and the covariances are all positive. Where do we go from here?

Ron Levine

01-06-2013 05:37 PM

Hi 1zmm,

I added the CORR option to the LSMEANS statement as well, and this does appear to show very high correlations among the LSM estimates; is this more helpful?

| cohort | Compl | Corr1 | Corr2 | Corr3 | Corr4 | Corr5 | Corr6 |
|---|---|---|---|---|---|---|---|
| a_COMBO | 0 | 1 | 0.983 | 0.9875 | 0.985 | 0.9904 | 0.989 |
| a_COMBO | 1 | 0.983 | 1 | 0.9863 | 0.9841 | 0.9894 | 0.9882 |
| b_VALVE | 0 | 0.9875 | 0.9863 | 1 | 0.9888 | 0.9942 | 0.9926 |
| b_VALVE | 1 | 0.985 | 0.9841 | 0.9888 | 1 | 0.9916 | 0.9902 |
| c_CABG | 0 | 0.9904 | 0.9894 | 0.9942 | 0.9916 | 1 | 0.9959 |
| c_CABG | 1 | 0.989 | 0.9882 | 0.9926 | 0.9902 | 0.9959 | 1 |

Ron Levine
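
A quick way to see why these correlations matter: the standard error of a difference can be written with the correlation in place of the covariance, SE = sqrt(se1^2 + se2^2 - 2*r*se1*se2). The Python sketch below applies this to the a_COMBO pair, using the rounded StdErr values (0.7772, 0.7786) and the Corr2 entry (0.983) from the tables in this thread. The helper name `se_diff_from_corr` is an assumption made for illustration.

```python
import math


def se_diff_from_corr(se_a: float, se_b: float, corr: float) -> float:
    """SE of (A - B) from the two standard errors and their correlation."""
    return math.sqrt(se_a ** 2 + se_b ** 2 - 2.0 * corr * se_a * se_b)


# a_COMBO, Compl=0 versus Compl=1 (rounded values from the posted output)
se_corr = se_diff_from_corr(0.7772, 0.7786, 0.983)

# Same inputs, but pretending the two LS means are uncorrelated
se_indep = se_diff_from_corr(0.7772, 0.7786, 0.0)

print(f"with r = 0.983: {se_corr:.3f}")
print(f"with r = 0:     {se_indep:.3f}")
```

With the reported correlation the result lands near the 0.1435 that SAS prints; with zero correlation it goes back to roughly 1.10. Because r is so close to 1, the answer is sensitive to how many decimals of the correlation are kept.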

01-06-2013 06:33 PM

1zmm,

From the covariance matrix, according to your description, we have (for the COMBO cohort, for example):

sqrt(0.6041+0.6063+0.5949+0.5949) = 1.549. This is way off the SAS computation of 0.1435, so we still have some work to do.

Ron Levine

01-07-2013 07:36 AM

I was mistaken about the formula for the variance of the DIFFERENCE between two least-squares means (LSMs).

When you want to obtain the variance of the SUM of two LSMs, you should ADD the variances for each LSM AND the two (identical) covariances between the two LSMs:

VAR(LSM1 + LSM2) = VAR(LSM1) + VAR(LSM2) + 2*COVAR(LSM1, LSM2).

The standard error of the SUM of two LSMs is the square root of VAR(LSM1 + LSM2).

When you want to obtain the variance of the DIFFERENCE between the two LSMs (as you wanted), you should add the variances for each LSM and, from this sum, SUBTRACT the two (identical) covariances between the two LSMs:

VAR(LSM1 - LSM2) = VAR(LSM1) + VAR(LSM2) - 2*COVAR(LSM1, LSM2).

The standard error of the DIFFERENCE between two LSMs is the square root of VAR(LSM1 - LSM2).

For example, from the variance-covariance matrix you show using the COV option in the LSMEANS statement,

the variance of the LSM for a_COMBO, Compl=0 equals 0.6041,

the variance of the LSM for a_COMBO, Compl=1 equals 0.6063, and

the covariance between these two LSMs equals 0.5949.

Thus, the variance of the DIFFERENCE between these two LSMs equals

0.6041 + 0.6063 - 2* 0.5949 = 0.0206.

The standard error of this DIFFERENCE equals the square root of this variance (0.0206) = 0.1435.

The variance of the DIFFERENCE between the LSM for b_VALVE, Compl=0 and the LSM for b_VALVE, Compl=1 equals

0.6001 + 0.6027 - 2*0.5947 = 0.0134.

The standard error of this DIFFERENCE equals the square root of this variance (0.0134) = 0.1158.

And so on.
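
The same calculation for all three cohorts can be sketched in a few lines of Python. This is only an illustration of the arithmetic above, not SAS code: the matrix is typed in by hand from the Cov1-Cov6 columns posted earlier in the thread, and `se_difference` is a hypothetical helper name.

```python
import math

# Row/column order of the posted COV output: (cohort, Compl)
labels = [("a_COMBO", 0), ("a_COMBO", 1),
          ("b_VALVE", 0), ("b_VALVE", 1),
          ("c_CABG", 0), ("c_CABG", 1)]

# Variance-covariance matrix of the six LS means (Cov1..Cov6, 4 decimals)
cov = [
    [0.6041, 0.5949, 0.5945, 0.5944, 0.5942, 0.5944],
    [0.5949, 0.6063, 0.5949, 0.5948, 0.5947, 0.5949],
    [0.5945, 0.5949, 0.6001, 0.5947, 0.5945, 0.5945],
    [0.5944, 0.5948, 0.5947, 0.6027, 0.5942, 0.5944],
    [0.5942, 0.5947, 0.5945, 0.5942, 0.5959, 0.5944],
    [0.5944, 0.5949, 0.5945, 0.5944, 0.5944, 0.5979],
]


def se_difference(cov_matrix, i: int, j: int) -> float:
    """sqrt(VAR(LSM_i) + VAR(LSM_j) - 2*COVAR(LSM_i, LSM_j))."""
    var_diff = cov_matrix[i][i] + cov_matrix[j][j] - 2.0 * cov_matrix[i][j]
    return math.sqrt(var_diff)


# Within-cohort differences: Compl=0 versus Compl=1
se_by_cohort = {}
for i in range(0, len(labels), 2):
    cohort = labels[i][0]
    se_by_cohort[cohort] = se_difference(cov, i, i + 1)
    print(f"{cohort}: SE of difference = {se_by_cohort[cohort]:.3f}")
```

For a_COMBO and b_VALVE this matches the worked values above. For c_CABG the hand-entered matrix gives about 0.071 rather than the 0.06972 SAS reports; the likely reason is that the covariances are printed to only four decimals, and the difference of nearly equal numbers magnifies that rounding.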

01-07-2013 08:33 AM

Thanks for this discussion, guys. I admit that my method was crude, and definitely was based on the assumption of independence. I am incorporating 1zmm's methods in any future calculations of this type.

Steve Denham
