Solved: Re: Lower order terms and interactions involving random effects in proc mixed

DIHS · Posted 06-18-2015 05:39 AM

Say you measure the size of individuals from a number of families. Within each family, some individuals are reared on high food and some on low food. I am interested in whether the families respond differently to the food treatment (i.e., the family-by-food interaction). The code is:

proc mixed data=mylib.size covtest;
  class family food;
  model size = food;          /* fixed effect: food treatment */
  random family family*food;  /* random effects: family and family-by-food interaction */
run;

I find that the covariance parameter estimate for family is 0 but family*food is significant. I also get the error ‘estimated G matrix is not positive definite’. If I remove family and keep family*food, family*food is significant and there are no error messages.
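For reference, the reduced fit described above (dropping family but keeping family*food) would presumably look like the following. This is only a sketch that reuses the data set and variable names from the code above; the exact code that was run is not shown in the post.

```sas
/* Reduced model: the family main effect is removed from the RANDOM statement */
proc mixed data=mylib.size covtest;
  class family food;
  model size = food;
  random family*food;   /* only the family-by-food interaction is random */
run;
```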

My question is: Can I have family*food in the random statement without family?

Many thanks.

Rick_SAS · Posted 06-18-2015 08:47 AM

The question of whether you can include an interaction term in a model without including the main effects has a long and stormy history. Can you include the interaction without the main effect? Yes, but many (most?) statisticians believe that you should include the main effects. It makes interpretation easier. For a longer treatment of this subject, see this CrossValidated discussion.

I'd vote "don't do it."

Incidentally, your example is essentially the Getting Started example for PROC MIXED, which uses data that all of us can run.

SteveDenham · Posted 06-18-2015 02:25 PM

Not often that I come down on the opposite side of the fence from Rick, but this is going to be one of those times.

I think there is a real difference between including main fixed effects in the face of interaction fixed effects and the same thing on the random side.

One can always "construct" fixed main effects from the interaction solution (Google "means model" Milliken, for example). However, for random effects, I am not so sure as I was when I started this post. I guess the key here would be to look at the degrees of freedom for the F tests of the fixed effects under the two specifications. Which one reflects the "skeleton ANOVA," to steal a phrase from Walt Stroup? That is the parameterization I would trust first, and that is the basis of Rick's "Don't do it," I suspect. However, as models become more complex, having a G matrix that isn't positive definite can run the risk of failure to converge, especially if you are working in PROC GLIMMIX with a conditional model due to distributional assumptions. In that case, I would carefully consider removing variance components that go to zero.
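One way to line up the degrees of freedom for the two specifications is sketched below. It assumes the data set and variables from the original post; the macro name fit, its parameters, and the output data set names are made up for illustration. Tests3 is the ODS table that holds the Type 3 tests of fixed effects in PROC MIXED.

```sas
/* Hypothetical helper: fit the model with a given RANDOM specification */
/* and save the Type 3 tests of fixed effects to a data set.            */
%macro fit(rand=, out=);
  ods output Tests3=&out;
  proc mixed data=mylib.size covtest;
    class family food;
    model size = food;
    random &rand;
  run;
%mend fit;

%fit(rand=family family*food, out=tests_full);     /* main effect + interaction */
%fit(rand=family*food,        out=tests_reduced);  /* interaction only */

/* Stack the two result tables and label which specification each row came from */
data tests_both;
  length spec $8;
  set tests_full(in=infull) tests_reduced;
  if infull then spec = 'full';
  else spec = 'reduced';
run;

proc print data=tests_both noobs;
  var spec Effect NumDF DenDF FValue ProbF;
run;
```

If DenDF for food differs between the two rows, the two specifications imply different error terms for the food test, which is the comparison suggested above.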

For the example in the original post, the family*food variance component dominates the algorithm such that family has a zero estimate under REML. See what happens if you change the methodology to maximum likelihood (or use the NOBOUND option).
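The two variations mentioned here could be tried as follows. This is a sketch that reuses the model from the original post; METHOD=ML and NOBOUND are both options on the PROC MIXED statement.

```sas
/* Same model, estimated by maximum likelihood instead of the default REML */
proc mixed data=mylib.size covtest method=ml;
  class family food;
  model size = food;
  random family family*food;
run;

/* Default REML, but with the lower bound of 0 on variance components removed */
proc mixed data=mylib.size covtest nobound;
  class family food;
  model size = food;
  random family family*food;
run;
```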

Steve Denham

lvm · Posted 06-18-2015 04:05 PM

Agree with Steve, overall. For the variance components model, it shouldn't make a difference (for most purposes) if you take out or leave in the main effect random term (assuming you don't use the NOBOUND option). As the model gets more complicated, the nonpositive definite G property can be quite problematic. Then you can take out the main effect for the random term. Model fitting can be quite a bit easier when 0 variance terms are not included. If you are uncertain about denominator degrees of freedom for the different models, use the ddfm=KR option on the model statement. This will typically make everything work out.
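As a sketch, the ddfm=KR suggestion applied to the reduced model from the original post (same data set and variable names assumed) would be:

```sas
/* Interaction-only random specification with Kenward-Roger degrees of freedom */
proc mixed data=mylib.size covtest;
  class family food;
  model size = food / ddfm=kr;   /* DDFM= is an option on the MODEL statement */
  random family*food;
run;
```

The same option can be added to the model that keeps family in the RANDOM statement, so the denominator degrees of freedom for food can be compared across the two fits.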

With NOBOUND, things are different. The 0 variance for the main effect may end up as a negative "variance". So, all the random terms are needed. This negative variance is nonsensical for a conditional interpretation of a model, but works fine for the marginal interpretation (i.e., as long as the TOTAL variance is positive).
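To make the marginal reading concrete, here is the covariance structure implied by the two-term random statement in the original post. The symbols are introduced only for illustration: $\sigma^2_F$ for family, $\sigma^2_{FT}$ for family*food, and $\sigma^2_e$ for the residual.

```latex
\begin{align*}
\operatorname{Var}(y) &= \sigma^2_F + \sigma^2_{FT} + \sigma^2_e
  && \text{(total variance of one observation)} \\
\operatorname{Cov}(y, y') &= \sigma^2_F + \sigma^2_{FT}
  && \text{(same family, same food level)} \\
\operatorname{Cov}(y, y') &= \sigma^2_F
  && \text{(same family, different food levels)}
\end{align*}
```

Read this way, a negative estimate of $\sigma^2_F$ under NOBOUND is just a negative covariance between members of the same family reared on different food levels, rather than a variance of family effects.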
