Discrepancy in the result of PROC MIXED by using e...

04-27-2016 10:25 PM - edited 04-27-2016 10:32 PM

This is my first post here. Thanks in advance.

I am trying to get the differences of least squares means using PROC MIXED. The code looks like this:

proc mixed data=dummy;
  class A B C;
  model Y = A B C D A*C A*D;
  lsmeans A / pdiff cl;
  estimate 'A2 vs A1' A -1 1 0 0;
  estimate 'A2 vs A3' A 0 1 -1 0;
  estimate 'A2 vs A4' A 0 1 0 -1;
  estimate 'A3 vs A1' A -1 0 1 0;
  estimate 'A4 vs A1' A -1 0 0 1;
  estimate 'A2 vs A3' A 0 1 -1 0;
run;

Usually, the results of the LSMEANS statement and the ESTIMATE statement should be consistent. But this time they are not:

Estimates

Label      Estimate   Standard Error   DF    t Value   Pr > |t|   Alpha   Lower      Upper
A2 vs A1   -55.1659   46.7075          375   -1.18     0.2383     0.05    -147.01    36.6755
A2 vs A3   19.0351    32.1103          375   0.59      0.5537     0.05    -44.1038   82.1739
A2 vs A4   -7.3886    35.7314          375   -0.21     0.8363     0.05    -77.6476   62.8705
A3 vs A1   -74.2009   45.4017          375   -1.63     0.1030     0.05    -163.47    15.0730
A4 vs A1   -47.7773   47.9488          375   -1.00     0.3197     0.05    -142.06    46.5050
A2 vs A3   19.0351    32.1103          375   0.59      0.5537     0.05    -44.1038   82.1739

Least Squares Means

A    Estimate   SE       DF    t Value   Pr > |t|   Alpha   Lower     Upper
A1   30.4191    3.5455   375   8.58      <.0001     0.05    23.4475   37.3907
A2   23.5044    2.0461   375   11.49     <.0001     0.05    19.4812   27.5275
A3   26.0330    2.0424   375   12.75     <.0001     0.05    22.0169   30.0491
A4   25.6443    2.0548   375   12.48     <.0001     0.05    21.6039   29.6847

Differences of Least Squares Means

Effect   A   _A   Estimate   SE       DF    t Value   Pr > |t|   Alpha   Lower     Upper
A        1   2    6.9148     4.0733   375   1.70      0.0904     0.05    -1.0946   14.9241
A        1   3    4.3861     4.0647   375   1.08      0.2812     0.05    -3.6062   12.3785
A        1   4    4.7749     4.0879   375   1.17      0.2435     0.05    -3.2632   12.8129
A        2   3    -2.5286    2.8617   375   -0.88     0.3775     0.05    -8.1557   3.0985
A        2   4    -2.1399    2.8889   375   -0.74     0.4593     0.05    -7.8204   3.5406
A        3   4    0.3887     2.8827   375   0.13      0.8928     0.05    -5.2797   6.0571

The ESTIMATE results look weird, but the differences of least squares means look reasonable.

I think the problem is the variable D. D is not of primary interest, so I did not include it in the CLASS statement, but it is in the MODEL statement, as both D and A*D. In the dataset, not every level of A occurs with every value of D. For example, D takes the values 5, 6, 7, 8, 9, 10, but when A=1, D only takes the values 6, 7, 8, 9, 10. So I think this is an unbalanced dataset. As a test, I created dummy values of D so that every level of A has every value of D, then reran the code above on the new dataset. This time the results matched.
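The imbalance described above can be checked directly. The following is a minimal sketch, assuming the dataset is named dummy as in the posted code; it only tabulates the data and does not change the model.

```sas
/* Cross-tabulate A by D to see which values of D occur at each level of A */
proc freq data=dummy;
  tables A*D / norow nocol nopercent;
run;

/* Summarize D within each level of A */
proc means data=dummy n mean min max;
  class A;
  var D;
run;
```

Empty cells in the A by D table correspond to the missing combinations mentioned in the post, such as A=1 with D=5.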

My question is: what is the logic behind the ESTIMATE statement that makes its result differ from that of the LSMEANS statement? Could somebody explain it to me? I just want to know what affects the result of the ESTIMATE statement. I think the LSMEANS result is reasonable because LSMEANS only estimates the fixed effects listed in the CLASS statement. Am I right? I attached the dataset FYI.

Thanks!

Accepted Solutions

Solution

08-18-2016
05:33 AM

Posted in reply to heansy

04-28-2016 01:25 PM

The main thing here is that your ESTIMATE statements are not the differences between the LSMEANS. Try adding the /e option to the LSMEANS statement. You will see that the estimable function includes much more than the parts you are including in your ESTIMATE statements, and that is the reason for the difference. D is handled as a continuous covariate, so the LSMEANS are the marginal values at the mean of D. Given that, it may be that even your LSMEANS are somewhat misleading, due to the imbalance. Get a copy of Littell et al.'s SAS for Mixed Models, 2nd ed., and read the chapter on analysis of covariance. The LSMEANS should probably be calculated using the AT= option, in order to accommodate the D and A*D terms in the model.
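A minimal sketch of both suggestions, reusing the model from the question. The value 7.5 in the AT option is only an assumed placeholder inside the range of D reported in the question (5 to 10); it is not taken from the data and should be replaced with a value that makes sense for the analysis.

```sas
proc mixed data=dummy;
  class A B C;
  model Y = A B C D A*C A*D;

  /* E prints the coefficients of the estimable function behind each
     LS-mean, so they can be compared with the ESTIMATE coefficients */
  lsmeans A / pdiff cl e;

  /* AT evaluates the LS-means at a chosen covariate value instead of
     the mean of D; 7.5 is a placeholder, not a recommendation */
  lsmeans A / pdiff cl e at D=7.5;
run;
```

Comparing the two coefficient tables should show how the A*D coefficients change with the value of D used.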

Finally, if you are working in later versions of SAS/STAT, consider using the LSMESTIMATE statement to calculate differences between least squares means rather than the ESTIMATE statement. The syntax is much closer to what you are using. The only addition would be inclusion of the AT= option to accommodate the unequal slopes model that you are fitting.
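As a sketch of that suggestion, the comparisons from the question could be written with LSMESTIMATE roughly as follows. This assumes a SAS/STAT release in which PROC MIXED supports LSMESTIMATE, and again uses 7.5 as an assumed placeholder for the covariate value.

```sas
proc mixed data=dummy;
  class A B C;
  model Y = A B C D A*C A*D;

  /* Coefficients apply to the LS-means of A (levels 1 to 4), not to the
     model parameters; AT fixes the covariate value (placeholder) */
  lsmestimate A 'A2 vs A1' -1  1  0  0,
                'A2 vs A3'  0  1 -1  0,
                'A2 vs A4'  0  1  0 -1,
                'A3 vs A1' -1  0  1  0,
                'A4 vs A1' -1  0  0  1 / at D=7.5 cl e;
run;
```

Because the coefficients are applied to LS-means, the interaction terms are carried along automatically rather than having to be spelled out by hand.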

Steve Denham

All Replies

Posted in reply to SteveDenham

04-28-2016 06:12 PM

Steve is correct. But note, the LSMESTIMATE statement does not work with continuous covariates, just factors.

04-29-2016 09:31 AM

But you can accommodate a continuous covariate by use of the AT option in the LSMESTIMATE statement. As long as there is one factor involved, you can add continuous covariates to your heart's content.

Steve Denham
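To see why the covariate value matters, here is a sketch of the algebra behind the replies above. The symbols are notation introduced here, not taken from the thread: the fitted model from the question is assumed to be written with effects alpha for A, beta for B, gamma for C, a common slope delta for D, and slope deviations theta for A*D. Bars denote unweighted averages over the levels of B or C, and d is the value at which D is evaluated.

```latex
\begin{aligned}
E[Y_{ijk}] &= \mu + \alpha_i + \beta_j + \gamma_k + \delta D
              + (\alpha\gamma)_{ik} + \theta_i D \\
\mathrm{LSM}_i(d) &= \mu + \alpha_i + \bar{\beta}_{\cdot} + \bar{\gamma}_{\cdot}
              + \delta d + \overline{(\alpha\gamma)}_{i\cdot} + \theta_i d \\
\mathrm{LSM}_i(d) - \mathrm{LSM}_{i'}(d) &= (\alpha_i - \alpha_{i'})
              + \left(\overline{(\alpha\gamma)}_{i\cdot}
              - \overline{(\alpha\gamma)}_{i'\cdot}\right)
              + (\theta_i - \theta_{i'})\, d
\end{aligned}
```

With the default LSMEANS statement, d is the mean of D, and the AT option sets it explicitly. An ESTIMATE statement that gives coefficients only for A puts zero weight on the A*D slopes, which appears to correspond to d = 0, well outside the range of 5 to 10 reported in the question.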