04-02-2014 11:41 AM

I have found procedures such as Tamhane's T2, Dunnett's T3, and the Games-Howell procedure that deal with unequal variances in the one-way model. However, I have a randomized complete block design, which is basically a two-way model, and the variances differ considerably across the treatment groups. In *SAS for Mixed Models, 2nd ed.* (p. 369), Ramon C. Littell et al. use an unequal variance model:

    proc mixed data=TV ic;
      class age sex;
      model time=sex|age / DDFM=KR OUTP=R;
      repeated / group=age;
      lsmeans age sex / diff adjust=Tukey;
    run;

However, I am not sure this is correct, since the Tukey multiple comparison test uses a pooled estimate of the variance, which affects the p-values when the variances are unequal. Is this the correct way of performing multiple comparisons under unequal variances? If yes, how does the Tukey adjustment handle the unequal variances?

04-03-2014 08:10 AM

First point:

The model presented fits a separate variance for each age group. As Westfall et al. put it in *Multiple Comparisons and Multiple Tests Using SAS, 2nd ed.* (Chapter 10, p. 274), "However, with extreme heteroscedasticity, Tukey's method can fail miserably." So I wouldn't use adjust=Tukey in this case.

Second point:

Westfall et al. point out that all methods are approximate under heteroscedastic variances. In Ch. 10, they give several methods and point out the pros and cons of each. I would suggest the MaxT adjustment.

Steve Denham

04-03-2014 08:15 AM

I will look this up! Is the MaxT adjustment implemented in SAS?

Solution

04-03-2014 08:40 AM

Yes.

For your code, it would be:

    proc mixed data=TV ic;
      class age sex;
      model time=sex|age / DDFM=KR OUTP=R;
      repeated / group=age;
      lsmeans age sex / diff adjust=simulate(seed=1) adjdfe=row;
    run;

I picked seed=1, but any value could be inserted. You do want to specify a seed, so that results are reproducible from one run to the next.

Steve Denham

04-03-2014 09:02 AM

I am still a bit uncertain as to what kind of test is performed under heteroscedasticity, and how close the reported p-values come to their nominal levels. Is the method well explained in *Multiple Comparisons and Multiple Tests Using SAS, 2nd ed.*? I might just pick up a copy to understand this method better!

04-03-2014 09:35 AM

Personal opinion: I think Westfall's book is, or should be, required reading for anyone who works with designed experiments. There is a lot of theory mixed in, but in a way that makes it easier to understand what the code is doing. I think the method is well explained, and there are references given that relate to the performance as well as summaries of performance.

Steve Denham

04-03-2014 10:19 AM

Steve,

Would you recommend the sim option as a matter of SOP or are there disadvantages?

04-03-2014 10:34 AM

We use it as our standard method. It grew out of the ERROR message you get when the Dunnett-Hsu adjustment (our previous standard) fails to converge. SAS recommends ADJUST=SIMULATE in this case.

The biggest disadvantages that we have seen are:

One: run time. An adequate number of simulations to give good results may lead to long run times, especially if you are trying to meet an accuracy target and you have a lot of endpoints.

Two: reproducibility. Because of the behavior of the seed and the RNG stream, BY-variable processing becomes much trickier, in the sense that getting identical results on separate machines is harder. We now have a good method of macro looping that avoids this problem, however.

Steve Denham
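Editor's illustration of the two disadvantages listed above: a minimal sketch, not the macro Steve's group actually uses. It assumes a hypothetical long-format dataset TRIAL with variables ENDPOINT, TRT, BLOCK, and Y, and a hypothetical macro name. Each endpoint gets its own PROC MIXED call with an explicit seed rather than relying on BY processing, and the ACC= and EPS= simulation options (shown with illustrative values) set the accuracy target that drives the number of samples and therefore the run time.

```sas
/* Hypothetical data: TRIAL, long format, variables ENDPOINT TRT BLOCK Y. */
%macro sim_by_endpoint(data=, endpoints=, seed=1);
  %local i ep;
  %let i = 1;
  %let ep = %scan(&endpoints, &i, %str( ));
  %do %while (%length(&ep) > 0);
    title "Endpoint: &ep";
    proc mixed data=&data;
      where endpoint = "&ep";          /* one endpoint per call, no BY statement */
      class trt block;
      model y = trt / ddfm=kr;
      random block;                    /* randomized complete block design */
      repeated / group=trt;            /* separate residual variance per treatment */
      /* Same seed in every call, so each endpoint's adjustment is reproducible.
         ACC= and EPS= values are illustrative; REPORT prints simulation details. */
      lsmeans trt / diff adjust=simulate(seed=&seed acc=0.005 eps=0.01 report)
                    adjdfe=row;
    run;
    %let i = %eval(&i + 1);
    %let ep = %scan(&endpoints, &i, %str( ));
  %end;
  title;
%mend sim_by_endpoint;

/* Example call with made-up endpoint names */
%sim_by_endpoint(data=trial, endpoints=alt ast bili, seed=1);
```

Tightening ACC= or EPS= increases the number of simulated samples, so it is worth timing one endpoint before looping over many.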

05-06-2014 02:36 PM

Steve,

I purchased the book and am well on my way through it. It is, just as you said, a really good source when working with designed experiments. Thank you!

I have one more question that I was hoping you could answer. I am currently working through the examples in the chapter we previously discussed, and I was wondering whether it would be appropriate to use the maxT/minP method when analysing experiments in a randomized complete block design.

I understand that the subset pivotality condition depends on the hypotheses formed to get strong control of the FWE. However, in the case where we might suspect heteroscedasticity among the levels of treatment, could the single-step maxT or minP be an appropriate solution? Are there any pitfalls that you are aware of?

05-09-2014 09:08 AM

I haven't fallen into any pits yet, but I haven't had severe heteroskedasticity in any of our data. We do see enough that, as a standard in our mixed model analyses, we model it, and accept the risk of losing power when variances are, in fact, nearly homogeneous. The maxT, as implemented by adjust=simulate, is our default. Hope this helps.

Steve Denham

05-12-2014 03:32 AM

Thank you for your answer!

I will try to find some simulation studies to determine power in these cases or conduct some myself!

12-17-2014 06:48 PM

Hello Steve,

I was searching the SAS forums for a solution to the same problem. A question for you regarding your response to the original post:

Is it possible to include an interaction term in the model statement? I am working with a data set that has two factor variables, SPECIES (three levels) and PARASITISM_STATUS (two levels). The variance in the response variable is much larger in one level of parasitism_status than the other, even when transformed. Equality of variances is not a problem for "species" when the data is log10-transformed. I am interested in the interaction between these two factor variables, so my current model statement is "model y = spp parasitism_status spp*parasitism_status." Is there a way to include this type of interaction term in the code you provided above?

Many thanks!

12-19-2014 12:41 PM

The code I provided used the shorthand expression

sex|age

which is identical to:

sex age sex*age;

So the interaction was included.

For your approach, I would consider using PROC GLIMMIX, doing something like:

    proc glimmix data=yourdataset;
      class spp parasitism_status;
      model y = spp|parasitism_status / link=log; /* Note: here y is not transformed prior to analysis */
      random _residual_ / group=parasitism_status;
      lsmeans spp parasitism_status spp*parasitism_status / exp;
      lsmestimate <these will compare the lsmeans of interest to address your study objectives, and will adjust for multiple comparisons>;
    run;

The lsmestimate statement is one of the finest blades in the Swiss Army knife that is GLIMMIX. Rather than looking at all possible comparisons, as would the diff option in the lsmeans statement, you can narrow down to those of interest. This is particularly important when looking at repeated measurements in time, where you really aren't very interested in comparing the mean of group 2 at timepoint 3 to the mean of group 5 at timepoint 11. By eliminating comparisons such as these, the overall type I error can be controlled better without loss of power.

Steve Denham
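Editor's illustration of how the LSMESTIMATE placeholder might be filled in: a hedged sketch only, since the comparisons of interest depend on the study objectives. It assumes the comparisons wanted are the two parasitism levels within each of the three species, and that the six cell means are ordered with species varying slowest (following the CLASS statement order). The labels are hypothetical, and the ELSM option is included so the coefficient layout can be checked against the actual level ordering in the data.

```sas
proc glimmix data=yourdataset;
  class spp parasitism_status;
  model y = spp|parasitism_status / link=log;
  random _residual_ / group=parasitism_status;
  /* Assumed cell order: (spp1,st1) (spp1,st2) (spp2,st1) (spp2,st2) (spp3,st1) (spp3,st2).
     Only three comparisons are adjusted for, not all 15 pairwise differences. */
  lsmestimate spp*parasitism_status
    'species 1: status 1 vs 2'  1 -1  0  0  0  0,
    'species 2: status 1 vs 2'  0  0  1 -1  0  0,
    'species 3: status 1 vs 2'  0  0  0  0  1 -1
    / adjust=simulate(seed=1) elsm cl;
run;
```

Because the model uses a log link, these estimates are differences on the log scale.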

12-21-2014 10:51 AM

Thank you so much for your reply, Steve! And thank you for drawing my attention to the lsmestimate statement in GLIMMIX. I'm just starting to learn how to use this platform in SAS.

How does the lsmestimate statement differ from the use of contrasts? Currently I have a GLIMMIX procedure coded for the same dataset I described, and I've used contrast statements to examine specific comparisons. Is this problematic?

12-22-2014 10:29 AM

From the Shared Concepts section of the documentation:

In contrast to the linear functions that are constructed with the ESTIMATE statement, you do not specify coefficients for the individual parameter estimates. Instead, with the LSMESTIMATE statement you specify coefficients for the least squares means; these are then converted for you into estimable functions for the parameter estimates.

One could insert CONTRAST for ESTIMATE here.

There are some specifics to keep in mind. I would use a CONTRAST statement if I were comparing BLUPs for specific levels of random effects. However, CONTRAST does not allow for the use of the AT option to get tests at specific values of a continuous covariate.

So, for me, an LSMESTIMATE statement provides a way to get a linear function of the lsmeans. That may be a contrast, or it could be an interesting function of any sort. Depending on the MODEL statement, it may be marginal or conditional with regards to the random effects.

Steve Denham
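Editor's illustration of the documentation passage quoted above: a minimal sketch, assuming the same hypothetical dataset and variable names used earlier in the thread (three species, two parasitism levels, species varying slowest in the interaction). The same comparison of the two parasitism levels, averaged over species, is written once with ESTIMATE, where the coefficients apply to the model parameters, and once with LSMESTIMATE, where they apply to the least squares means.

```sas
proc glimmix data=yourdataset;
  class spp parasitism_status;
  model y = spp|parasitism_status / link=log;
  random _residual_ / group=parasitism_status;

  /* ESTIMATE: coefficients on the parameters. The main-effect difference
     also needs one third of each interaction parameter, hence DIVISOR=3. */
  estimate 'status 1 vs 2 (ESTIMATE)'
    parasitism_status      3 -3
    spp*parasitism_status  1 -1  1 -1  1 -1 / divisor=3;

  /* LSMESTIMATE: coefficients on the two parasitism_status LS-means;
     the conversion to parameter coefficients is done by the procedure. */
  lsmestimate parasitism_status 'status 1 vs 2 (LSMESTIMATE)' 1 -1;
run;
```

With balanced, default LS-means the two statements should report the same estimate, which makes this a convenient check when hand-coding ESTIMATE or CONTRAST coefficients.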