EricVanceMartin
Obsidian | Level 7

By default, MIXED gives two types of tests of the fixed effects, a t test and an F test. 

 

I can't say that I understand the differences between these (like when each might be useful/appropriate), but the p-values they produce are always identical, except...

 

In some cases when the fixed parameter estimate is very, very small, the t-test table will report SE=0, DF=0, t=., Pr>t=.

 

In this case, there is always a significance test for F in the Type 3 table.

 

First, can you help me understand the difference between these tests?

 

Second, is it OK for me to report the Type 3 results? Is it important that I note that this test was used? On the other hand, should I see this result as a red flag that something is wrong (and, if so, what?)? 

Accepted Solution
StatsMan
SAS Super FREQ

You can find the formula for the t-statistic here. The formula for C-hat can be found near the bottom of this page.

 

When looking at the s.e.'s of parameters that have very small estimates, you can get into a situation where the calculation of that s.e. becomes numerically treacherous. That's when you will see an estimate of 0 for the s.e.

 

In the bigger picture, optimization algorithms perform better if the final estimates for all parameters are on the same scale. If your parameter estimates are more than a few orders of magnitude apart, then you may want to rescale one of the offending variables to bring its final parameter estimate more in line with the others in your model.

 

 


9 REPLIES 9
SteveDenham
Jade | Level 19

The t tests you are referring to are those for the parameter estimates, I believe.  Recall that if you have multiple levels for classification variables, the reference level (usually the last) will have an estimate of 0 and all of the characteristics you list.  All of the other estimates are deviations from that reference level.  The global test of the null hypothesis that the means of all the levels of a factor are identical is the F test in the Type 3 summary, and those are the crucial p values, if you are basing your decisions on p values.

 

Steve Denham

EricVanceMartin
Obsidian | Level 7

I appreciate your reply. Yes, these are all tests of fixed effect parameter estimates. The predictors are all continuous rather than categorical, however, so this does not explain the problem. 

SteveDenham
Jade | Level 19

Hi Eric,

Can you share some of your code and output?  Ordinarily, I trust the Type 3 results over anything else, but this sounds like an interesting development, and I'd like to know more.

 

Steve Denham

EricVanceMartin
Obsidian | Level 7

Steve--I'm very glad for your attention.

 

I have had results like this in several different models and model-development phases with this dataset. In this case, the "weird" predictor is an interaction, but I have seen this result with single-variable predictors as well. So far I have just shrugged and assumed that, for some reason, SAS can't calculate this test for this estimate. The pattern is that the estimate is always very, very small when this happens. I've also assumed that the Type 3 test is equivalent, since it is for the other predictors, and that it somehow deals with the small size better.

 

My model development is based on deviance testing. However, I've noticed that almost always, a predictor that makes a significant contribution to model fit is also a significant predictor in the fixed solution. I mostly want to make sure that I'm not making a mistake by keeping predictors for which SAS chokes on the calculation of the t-test for the fixed effect--or maybe to find out that this is telling me something a more seasoned modeler would know.

 

In case it matters, the model is longitudinal. Though the time variable is NS in the fixed solution here, it begins the modeling process as significant. The addition of the L1 covariates renders it NS. The variance of the time slope is still significant even after the addition of the L1 predictors.

 

Here's the code. Variable names changed to protect the innocent. X for L1 covariates. Z for L2 covariates. (EDIT: I see the word "Time" is treated as special by the SAS code. That's not the name of the variable in the actual code.)

 

proc mixed
      data   = LNCT
      method = ml
      covtest
      noclprint;
class FID;
model Lnbike_Pk = Time
                  X1
                  X2
                  X2*X2    /* quadratic term; posted as X2^2, which the MODEL statement does not accept */
                  X3
                  X4
                  Time*X1
                  Z1
                  Z2
                  Z3
                  Z4
                  Z1*Z3
      / solution
      ddfm = SATTERTHWAITE;
random intercept
       Time
       / sub = FID
       type = un;
run;

Attaching an image of the fixed effects tables (second table is Type 3). Please let me know if the other results are relevant and I will post.

Screen Shot 2016-05-09 at 2.02.37 PM.png

 

lvm
Rhodochrosite | Level 12

The issue is probably related to the Satterthwaite ddf method. What happens when you take out ddfm=satterth? Try changing this to ddfm=KR to see if you get the same answer, or try ddfm=bw.

 

You are using an unstructured covariance matrix. This can result in an invalid estimated covariance matrix, especially if you have very high correlation in your random-coefficient model. Try changing this to type=fa0(2). 

 

I am mostly giving you advice to see if the discrepancy remains under these different model specifications. This may be a guide. Also, have you tried fitting the same model in GLIMMIX?
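
A minimal sketch of the two diagnostic changes suggested above, applied to the model as posted. The dataset and variable names are the placeholders from the original post, the quadratic term is written as X2*X2, and the two changes could just as well be tried one at a time:

```sas
/* Diagnostic refit only: same fixed effects as the posted model. */
proc mixed data=LNCT method=ml covtest noclprint;
  class FID;
  model Lnbike_Pk = Time X1 X2 X2*X2 X3 X4 Time*X1
                    Z1 Z2 Z3 Z4 Z1*Z3
        / solution ddfm=kr;            /* was ddfm=satterthwaite */
  random intercept Time
        / subject=FID type=fa0(2);     /* was type=un */
run;
```

The point of the refit is to compare the solution table and the Type 3 table across specifications, not to pick a final model.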

EricVanceMartin
Obsidian | Level 7

Thank you lvm!

 

The change in covariance matrix gives somewhat higher variances that are otherwise similar in proportion and statistical significance.

 

The change in ddfm to KR had no apparent effect. With BW, the main fixed effects table for the troublesome interaction gives:

 

Est = 9.07E-10

SE = 0

DF = 465

t = Infty

p = <.0001

 

The estimate is identical with each of the df calculation methods. The F values and corresponding p values are also nearly identical (6 fewer df with ddfm=BW).

 

I have long known I was being rather liberal with the unstructured matrix. I changed to this during model development when I was having convergence problems and never went back.

 

Are there any assumptions/risks/cautions with the different methods of df calculation? I had read about this and thought Satterthwaite was appropriate.

lvm
Rhodochrosite | Level 12

Your choice of df is fine (although I usually use KR as my first choice). I just wanted to know if this change affected your results (KR adjusts standard errors and df). I am not recommending bw or other methods. This was for testing purposes.

 

The Type 3 F test for the interaction of continuous variables is based strictly on the parameter estimate and standard error in the solution table (actually the variance of the parameter estimate, the square of the SE). Thus, it is surprising that you get the difference. If there were a problem with the parameter estimate and SE, it would also show up in the Type 3 F test. (If you put an E option on the model statement, you will see that only the interaction parameter, with its corresponding variance, is involved in the Type 3 F test.) I think the solution table result is partly due to the small coefficient coupled with a very large value for the product of the two continuous variables. You should try rescaling z1 or z2. In a data step, redefine z1 as z1/10000 (as an example). See what happens.
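
The equivalence described here can be written out. This is a sketch using the notation of the PROC MIXED documentation, where \hat{C} is the estimated covariance matrix of the fixed-effect estimates. Taking L to be a single row that picks out the interaction parameter \beta_j is an assumption that holds for a one-parameter effect of continuous variables:

```latex
t = \frac{\hat\beta_j}{\sqrt{\hat{C}_{jj}}},
\qquad
F = \frac{(L\hat\beta)^{\top}\,(L\hat{C}L^{\top})^{-1}\,(L\hat\beta)}{\operatorname{rank}(L)}
  = \frac{\hat\beta_j^{2}}{\hat{C}_{jj}}
  = t^{2}.
```

Both statistics use the same two ingredients, the estimate and its variance. A finite F next to a printed SE of 0 would therefore suggest a variance that is extremely small rather than exactly zero, though that reading is an inference and not something the output confirms.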

 

By the way, the df=0 result you originally posted is due to the Satterthwaite method. Problems with the estimated G matrix can cause problems with this method (when you switched to BW, you got more reasonable df). I would like to see the estimated G matrix with the original UN structure and with the fa0(2) structure. Put a G option on the random statement. (The Covariance matrix table with the fa0(2) structure does not give the G matrix directly.)
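
A sketch of what these requests might look like in code, using the placeholder names from the original post. The dataset name LNCT_rs and the variable name Z1_rs are hypothetical, choosing Z1 as the variable to rescale is an assumption, and 10000 is only the example divisor mentioned above:

```sas
/* Hypothetical rescaled copy of the data. */
data LNCT_rs;
  set LNCT;
  Z1_rs = Z1 / 10000;   /* example divisor from the post, not a recommendation */
run;

proc mixed data=LNCT_rs method=ml covtest noclprint;
  class FID;
  model Lnbike_Pk = Time X1 X2 X2*X2 X3 X4 Time*X1
                    Z1_rs Z2 Z3 Z4 Z1_rs*Z3
        / solution ddfm=satterthwaite
          e;                        /* print the L matrix coefficients used by the tests */
  random intercept Time
        / subject=FID type=un
          g;                        /* print the estimated G matrix */
run;
```

Running the same step a second time with type=fa0(2) in place of type=un would give the other G matrix asked for here.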

plf515
Lapis Lazuli | Level 10

What's happening with the T-tests seems to be related to the SE being too small to be estimated. There's probably a way of changing the default, but a better method is probably to change the units of the variables that have such small parameter estimates. 

I think this is a problem similar to measuring human height in light-years or something. Changing the units will also give you an easier time reading the output and give you more significant figures for the parameter estimate.
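
To make the units argument concrete, here is a short sketch of what a change of units does to a single slope. The constant c is a hypothetical rescaling factor, and the identities hold in exact arithmetic:

```latex
x^{*} = \frac{x}{c}
\quad\Longrightarrow\quad
\hat\beta^{*} = c\,\hat\beta,
\qquad
\operatorname{se}(\hat\beta^{*}) = c\,\operatorname{se}(\hat\beta),
\qquad
t^{*} = \frac{\hat\beta^{*}}{\operatorname{se}(\hat\beta^{*})} = t .
```

The test itself is unchanged, but the estimate and its standard error both move by the factor c, which can bring them into a range that is computed and printed more reliably. For an interaction, rescaling either variable in the product scales the interaction coefficient in the same way.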

 

What's interesting is that the F test doesn't seem to be affected.  I found the formula for the F test here: https://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_mixed_sect0... but I didn't see the formula for the t tests anywhere. 

 

 

