John_K
Calcite | Level 5

Hi Everyone!

I've been trying to learn more about crossed random effects modeling in SAS. Playing around with some data, I just haven't been able to resolve one annoying issue. 

 

So imagine an experiment where the participants are presented with the same two letters over and over again (say 336 times), categorizing each letter as quickly as possible when they see it. I set up the model as follows:

 

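PROC MIXED DATA=work.csvfile ITDETAILS NAMELEN=100 METHOD=REML;
  /* Categorical variables: participant, stimulus letter, trial */
  CLASS SubjectID Letter TrialID;
  /* Fixed effect of Letter; Satterthwaite denominator degrees of freedom */
  MODEL RT = Letter / SOLUTION DDFM=SATTERTHWAITE;
  /* Crossed random intercepts for participants and trials */
  RANDOM INTERCEPT / SUBJECT=SubjectID TYPE=UN;
  RANDOM INTERCEPT / SUBJECT=TrialID TYPE=UN;
  /* Least squares means for each letter, plus all pairwise differences */
  LSMEANS Letter / PDIFF=ALL;
RUN;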

 The running man speeds off, and the model converges. Just before heading to the pub to celebrate, I notice that the degrees of freedom for the LSmeans estimates are in the ~60 range, while the DFs for the Type 3 tests are in the thousands. Is this really the case? This seems incongruent. Can I really have ~60 degrees of freedom to estimate individual means while simultaneously having ~5000 degrees of freedom to assess whether those means are different from each other? When there are more levels in the IV, adding random effects for them brings the Type 3 DFs in line with the LSMEAN estimates, but of course, that approach can fail to converge and isn’t possible to do in a situation like this where there are only two levels.

 

Adding a random intercept for trials nested in subjects or a repeated term for Letter (which shouldn't do anything in this model) produces the same outcome. If you have an insight into what might be going on or what I'm missing, I'd love to hear it. Thanks for taking the time.

 

Accepted Solutions
PaigeMiller
Diamond | Level 26

Since you didn't show us the output, I will make only limited and general comments here.

 

I notice that the degrees of freedom for the LSmeans estimates are in the ~60 range, while the DFs for the Type 3 tests are in the thousands. Is this really the case?

 

The degrees of freedom for the LSMEANS belong to the test of the hypothesis that each estimate is equal to zero. This is done via a t-test, and the degrees of freedom are usually the number of data points used, minus 1.

 

The degrees of freedom for the Type 3 tests belong to a different hypothesis: whether the estimates for all the levels of a variable are equal. This is usually tested via an F-test, and the degrees of freedom involve the number of levels AND the number of replicate observations at each level.

 

So, yes, the LSMEANS test can have degrees of freedom around 60 and the Type 3 tests can have a DF in the thousands. I see nothing incongruous or obviously wrong here.
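
For anyone wanting to check this against their own output, one option is to save the relevant tables with ODS OUTPUT and look at the DF columns together. This is only a sketch: it assumes the data set and variable names from the question, and the output data set names t3, lsm, and dif are made up for illustration.

```sas
/* Sketch: save the Type 3, LSMEANS, and pairwise-difference tables
   so their degrees of freedom can be inspected together.
   The data set names t3, lsm, and dif are illustrative only. */
ods output Tests3=t3 LSMeans=lsm Diffs=dif;

proc mixed data=work.csvfile method=reml;
  class SubjectID Letter TrialID;
  model RT = Letter / solution ddfm=satterthwaite;
  random intercept / subject=SubjectID type=un;
  random intercept / subject=TrialID type=un;
  lsmeans Letter / pdiff=all;
run;

/* Print each saved table; compare DenDF in t3 with DF in lsm and dif */
proc print data=t3;  run;
proc print data=lsm; run;
proc print data=dif; run;
```

Because Letter has only two levels, the Type 3 F-test has a single numerator degree of freedom and tests the same contrast as the one pairwise difference from PDIFF, so the DenDF in t3 would be expected to agree with the DF in dif. The DF in lsm belong to the tests of each mean against zero, which is a different question.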

 

Lastly, despite the title of this post, I believe (can't prove it, but I certainly believe it) that SAS computes the correct degrees of freedom given the model specification. I say this because not only my experience, but the experience of 22 bazillion, 800 thousand and 42 real-life applications (and counting) seem to indicate that SAS gives the correct degrees of freedom. Usually, the error is specifying the model improperly rather than SAS computing the wrong degrees of freedom for the model specified.

--
Paige Miller

John_K
Calcite | Level 5

Thanks for taking the time to respond, Paige! Output attached below.

The degrees of freedom for the LSMEANS belong to the test of the hypothesis that each estimate is equal to zero. This is done via a t-test, and the degrees of freedom are usually the number of data points used, minus 1.

 

The degrees of freedom for the Type 3 tests belong to a different hypothesis: whether the estimates for all the levels of a variable are equal. This is usually tested via an F-test, and the degrees of freedom involve the number of levels AND the number of replicate observations at each level.

 

So, yes, the LSMEANS test can have degrees of freedom around 60 and the Type 3 tests can have a DF in the thousands. I see nothing incongruous or obviously wrong here.

 

Gotcha. That makes sense. So relative to a dataset where each subject experiences each item once, the fact that each level is repeated 168 times is what drives the rise in the degrees of freedom in the Type 3 tests? That would be a relief, as opposed to my having failed to correctly specify the dependency inherent in presenting the same two levels repeatedly.

Lastly, despite the title of this post, I believe (can't prove it, but I certainly believe it) that SAS computes the correct degrees of freedom given the model specification. I say this because not only my experience, but the experience of 22 bazillion, 800 thousand and 42 real-life applications (and counting) seem to indicate that SAS gives the correct degrees of freedom. Usually, the error is specifying the model improperly rather than SAS computing the wrong degrees of freedom for the model specified.

 

Definitely. I'm certain SAS is estimating the DFs correctly. The potential for my incorrect specification of the model was what I had in mind.
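
If the worry is that the Letter effect itself might differ from participant to participant, one specification that is sometimes tried is to add a random intercept for each participant-by-letter combination. The following is a sketch under that assumption, not a recommendation for this particular data set. It reuses the data set and variable names from the original post, and whether the extra variance component can be estimated depends on the data.

```sas
/* Sketch: add a participant-by-letter variance component so that the
   Letter effect is allowed to vary across participants.
   This is an assumed alternative, not the model from the original post. */
proc mixed data=work.csvfile method=reml;
  class SubjectID Letter TrialID;
  model RT = Letter / solution ddfm=satterthwaite;
  random intercept / subject=SubjectID;         /* participant-level shift */
  random intercept / subject=SubjectID*Letter;  /* participant-specific Letter effect */
  random intercept / subject=TrialID;           /* trial-level shift */
  lsmeans Letter / pdiff=all;
run;
```

Comparing the Covariance Parameter Estimates table and the Type 3 denominator DF from this fit with those from the original model would show whether the extra term changes anything.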


Output.png
