Statistical Procedures

John_K · Posted 04-10-2019 08:35 AM

Hi Everyone!

I've been trying to learn more about crossed random effects modeling in SAS. Playing around with some data, I just haven't been able to resolve one annoying issue.

So imagine an experiment where the participants are presented with the same two letters over and over again (say 336 times), categorizing each letter as quickly as possible when they see it. I set up the model as:

PROC MIXED DATA=work.csvfile ITDETAILS NAMELEN=100 METHOD=REML;
CLASS SubjectID Letter TrialID;
MODEL RT = Letter / SOLUTION DDFM=SATTERTHWAITE;
RANDOM intercept / SUBJECT=SubjectID TYPE=UN;
RANDOM intercept / SUBJECT=TrialID TYPE=UN;
LSMEANS Letter / PDIFF=ALL;
RUN;

The running man speeds off, and the model converges. Just before heading to the pub to celebrate, I notice that the degrees of freedom for the LSmeans estimates are in the ~60 range, while the DFs for the Type 3 tests are in the thousands. Is this really the case? This seems incongruent. Can I really have ~60 degrees of freedom to estimate individual means while simultaneously having ~5000 degrees of freedom to assess whether those means are different from each other? When there are more levels in the IV, adding random effects for them brings the Type 3 DFs in line with the LSMEAN estimates, but of course, that approach can fail to converge and isn’t possible to do in a situation like this where there are only two levels.

Adding a random intercept for trials nested in subjects or a repeated term for Letter (which shouldn't do anything in this model) produces the same outcome. If you have an insight into what might be going on or what I'm missing, I'd love to hear it. Thanks for taking the time.
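
For readers unfamiliar with the idea mentioned above of adding random effects for the levels of the IV, here is a minimal sketch of one common way to write a by-subject random effect for Letter in PROC MIXED. It is an assumption about what was meant, not code from this thread, and whether it converges or is appropriate for a two-level factor in these data is not established here.

```sas
/* Sketch only: an assumed variant of the model above, not the poster's code. */
PROC MIXED DATA=work.csvfile METHOD=REML;
  CLASS SubjectID Letter TrialID;
  MODEL RT = Letter / SOLUTION DDFM=SATTERTHWAITE;
  /* One random effect per Letter level within each subject;           */
  /* TYPE=UN requests a 2x2 unstructured covariance for the two levels. */
  RANDOM Letter / SUBJECT=SubjectID TYPE=UN;
  /* Crossed random intercept for trials, as in the original model. */
  RANDOM intercept / SUBJECT=TrialID;
  LSMEANS Letter / PDIFF=ALL;
RUN;
```

Comparing the denominator DF in the Type 3 table and the PDIFF table between this variant and the original model would show how much the random-effects specification drives the values.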

PaigeMiller · Posted 04-10-2019 09:38 AM

Since you didn't show us the output, I will make only limited and general comments here.

> I notice that the degrees of freedom for the LSmeans estimates are in the ~60 range, while the DFs for the Type 3 tests are in the thousands. Is this really the case?

Degrees of freedom for the LSMEANS are for testing the hypothesis that the estimate is equal to zero. This is done via a t-test, and the DF is usually the number of data points used, minus 1.

Degrees of freedom for the Type 3 tests are testing a different hypothesis, whether or not the estimates of all the different levels of a variable are all equal; this is usually done via an F-test, and involves the number of levels AND the number of replicate observations at each level.

So, yes, the LSMEANS test can have degrees of freedom around 60 and the Type 3 tests can have a DF in the thousands. I see nothing incongruous or obviously wrong here.
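
As background for the two kinds of DF discussed above (this note is not part of the original reply): with the Satterthwaite method requested in the model, the DF for a single estimable function $\ell'\beta$ is approximated from how precisely the variance of that estimate is itself estimated. A sketch in standard mixed-model notation follows; the symbols $\hat{C}$, $g$, $A$ and $\theta$ are introduced here for illustration and do not appear in the thread.

```latex
\nu \;\approx\; \frac{2\,\bigl(\ell' \hat{C}\, \ell\bigr)^{2}}{g' A\, g},
\qquad
\hat{C} = \bigl(X' \hat{V}^{-1} X\bigr)^{-},
\qquad
g = \left.\frac{\partial\,\bigl(\ell' C\, \ell\bigr)}{\partial \theta}\right|_{\theta = \hat{\theta}}
```

Here $\theta$ collects the covariance parameters and $A$ is the asymptotic covariance matrix of $\hat{\theta}$. Because $\nu$ depends on $\ell$, a single LSMEANS row and a difference between two rows need not share the same DF. With only two levels of Letter, the Type 3 test is a one-DF test of that difference, so its denominator DF would be expected to match the PDIFF row rather than the rows for the individual means.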

Lastly, despite the title of this post, I believe (can't prove it, but I certainly believe it) that SAS computes the correct degrees of freedom given the model specification. I say this because not only my experience but also the experience of 22 bazillion, 800 thousand and 42 real-life applications (and counting) seems to indicate that SAS gives the correct degrees of freedom. Usually, the error is specifying the model improperly rather than SAS computing the wrong degrees of freedom for the model specified.

--
Paige Miller

John_K · Posted 04-10-2019 10:12 AM

Thanks for taking the time to respond, Paige! Output attached below.

> Degrees of freedom for the LSMEANS are for testing the hypothesis that the estimate is equal to zero. This is done via a t-test, and the DF is usually the number of data points used, minus 1.
>
> Degrees of freedom for the Type 3 tests are testing a different hypothesis, whether or not the estimates of all the different levels of a variable are all equal; this is usually done via an F-test, and involves the number of levels AND the number of replicate observations at each level.
>
> So, yes, the LSMEANS test can have degrees of freedom around 60 and the Type 3 tests can have a DF in the thousands. I see nothing incongruous or obviously wrong here.

Gotcha. That makes sense. So relative to a dataset where each subject experiences each item once, the fact that each level is repeated 168 times is what drives the rise in the degrees of freedom in the Type 3 tests? That would be great, as opposed to my not having correctly specified the dependency inherent in having the same two levels presented repeatedly.

> Lastly, despite the title of this post, I believe (can't prove it, but I certainly believe it) that SAS computes the correct degrees of freedom given the model specification. I say this because not only my experience, but the experience of 22 bazillion, 800 thousand and 42 real-life applications (and counting) seem to indicate that SAS gives the correct degrees of freedom. Usually, the error is specifying the model improperly rather than SAS computing the wrong degrees of freedom for the model specified.

Definitely. I'm certain SAS is estimating the DFs correctly. The potential for my incorrect specification of the model was what I had in mind.

How to correctly determine degrees of freedom in MIXED for a simple crossed random effects model
