PROC TTEST versus PROC GLM - CONTRAST

05-22-2013 02:20 PM

I'm running tests on a dataset where my CLASS variable has 9 levels. Two of the levels are very similar, and I want to determine whether they are significantly different from each other to see if they can actually be separated or whether they need to be combined.

When I run PROC TTEST restricted to these two levels, they are shown to be significantly different from each other (Pr > |t| < 0.0001 and Pr > F < 0.0001).

When I run PROC GLM modeling the same 'var' variable for these two levels, however, the CONTRAST statement returns a result that is still significant but quite different (Pr > F = 0.0407).

Should I expect TTEST and CONTRAST to find the same significance? Is it recommended to rely on one or the other when testing for a significant difference?

Also, if I run the full model (with an additional CLASS variable), the Pr > F in the CONTRAST output for the two levels increases to 0.3231. Do you think the levels need to be significantly different from each other in this full model, or only in the more basic model? I realize this decision may be left to the discretion of the modeler.

Accepted Solutions

Posted in reply to AJones

05-23-2013 08:28 AM

There really isn't a reason to believe that the contrast in GLM and the t test will give the same answer, as the data used are not the same. The t test uses only the two groups being compared. GLM uses all of the groups, and bases the contrast on the mean square error (MSE) pooled across them, under the assumption of homogeneity of variance. The other groups therefore contribute to the estimated standard error of the contrast.
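To make the difference concrete, here is a minimal sketch of the two analyses side by side. The data set name `study`, the CLASS variable `group` (assumed to be coded 1 through 9), the response `y`, and the choice of levels 3 and 4 as the pair of interest are all hypothetical, since the original post does not show its code.

```sas
/* Hypothetical names: data set STUDY, CLASS variable GROUP (levels 1-9),
   response Y. Levels 3 and 4 stand in for the two similar levels. */

/* Two-sample t test: only observations in levels 3 and 4 are used */
proc ttest data=study;
   where group in (3, 4);
   class group;
   var y;
run;

/* One-way model on all 9 levels: the contrast compares levels 3 and 4,
   but its error term is the MSE pooled over all 9 groups */
proc glm data=study;
   class group;
   model y = group;
   contrast 'Level 3 vs 4' group 0 0 1 -1 0 0 0 0 0;
   estimate 'Level 3 - 4'  group 0 0 1 -1 0 0 0 0 0;
run;
quit;
```

The CONTRAST coefficients follow the sorted order of the CLASS levels, so the positions of the 1 and -1 would need to match wherever the two levels of interest fall in your own data. The ESTIMATE statement is optional; it reports the estimated difference and its standard error, which makes it easier to see how the pooled MSE changes the test.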

Adding the additional CLASS variable removes an additional source of variation. One good example would be a response with an additive gender effect. Adding gender as a class variable would reduce the variability estimate, but it could also remove a major difference between the levels of the groups.
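A sketch of the fuller model might look like the following. It reuses the hypothetical names `study`, `group`, and `y`, and borrows `gender` from the example above as a stand-in for the unnamed second CLASS variable; a main-effects model is assumed.

```sas
/* Hypothetical names throughout; GENDER stands in for the second
   CLASS variable, which the original post does not name. */
proc glm data=study;
   class group gender;
   model y = group gender;
   /* Same comparison of levels 3 and 4, now adjusted for GENDER */
   contrast 'Level 3 vs 4' group 0 0 1 -1 0 0 0 0 0;
run;
quit;
```

Here the contrast tests the difference between the two levels after accounting for the second factor, so both the error term and the estimated difference itself can change relative to the one-way model.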

In the end, the model ought to reflect the design (or in Walter Stroup's words: What would Fisher do?). Combining or not combining two levels is more than just a question of significance testing.

Steve Denham

All Replies

Posted in reply to SteveDenham

05-23-2013 12:05 PM

Thank you. I thought I was restricting to the two levels in GLM by using a WHERE statement, but in fact I had not thought to delete the 7 extra zero placeholders in the CONTRAST statement. That must have been throwing it off, because the probability does indeed now match the TTEST.
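For anyone hitting the same issue, the corrected setup would look roughly like this. The names `study`, `group`, and `y`, and the level codes 3 and 4, are hypothetical; the point is only that once WHERE leaves two levels in the CLASS variable, the CONTRAST needs two coefficients.

```sas
/* Hypothetical names: data set STUDY, CLASS variable GROUP, response Y */
proc glm data=study;
   where group in (3, 4);            /* only two levels remain in the model */
   class group;
   model y = group;
   contrast 'Level 3 vs 4' group 1 -1;   /* one coefficient per remaining level */
run;
quit;
```

With only two groups in the model, the contrast F statistic is the square of the pooled-variance t statistic, so its Pr > F should agree with the Pr > |t| on the Pooled line of the PROC TTEST output rather than the Satterthwaite line. The Pr > F that PROC TTEST itself reports belongs to the folded F test for equality of variances, not to the comparison of means.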

What you say about the additional CLASS variable certainly makes sense.

As far as defining/combining variable levels goes, I agree that this should not be done post-experiment. In this case, I'm doing a meta-analysis and looking for trends to determine whether certain conditions are significant across studies--I am neutral as to whether they are combined or not, but will certainly comment on it either way in my analysis.

I appreciate your thoughtful response and explanation!