I have a dataset that I am running with the following code. There are two treatments and two time points. Each animal has one of the treatments and both time points: before and after treatment. Each time point has replicates, though not the same number at each time point.
proc gLiMMix data = b initGLM initIter = 500 itDetails chol method = quad(qPoints = 10 qCheck qMax = 100 initPL = 500);
class animalID treatment time;
model soft = treatment|time / dist = lognormal solution cl covB(details);
random intercept time / subject = animalID type = un solution cl g;
run;
The interesting thing is that if I reorder the random variables to be:
random time intercept..
I do not get a valid objective function. If I switch to the default method rather than quad, it runs both ways; however, the degrees of freedom on the time variable when ordered "intercept time" (7) are half those when ordered "time intercept" (14). Maybe I just haven't thought about this long enough, but I don't understand why the syntactical order of the variables would matter. Is this only an issue with the intercept? Do I need to keep ordering in mind if I had other variables in the mix too?
Thanks!
Michael
(Windows 7 Professional, SAS 9.4)
Edit: This also happens with a log-transformed dependent variable and a normal distribution, as well as using proc mixed with the log-transformed dependent variable. In addition, for "time intercept" there are 0 degrees of freedom for the overall fixed intercept. It's almost as if when specified in this order, the random intercepts are using up all the degrees of freedom from the fixed intercept.
Message was edited by: Michael Cooney
I contacted tech support. It appears to be a quality of the containment method of df. Here's their full response:
Michael,
Thank you for contacting SAS Technical Support regarding your question on PROC MIXD / PROC GLIMMIX.
This is not a bug with PROC MIXED or PROC GLIMMIX.
For the issue of G matrix --
If you had used TYPE=VC not TYPE=UN, you would have seen the G matrix switched the order of elements, as you expected.
However, with TYPE=UN, you will have non-zero covariance, and things can be more complicated. When you specify different order of the random effects on your RANDOM statement, the Z matrices are different. The covariance in your data is estimated by V=ZGZ'+R. While V should be the same between the two specifications of your models (and so is the R matrix), G cannot be the same because of different Z matrix. In other words, the ZGZ' part is the same between the two specifications, but because Z matrix is different and G has a UN structure, the estimated elements in the G matrix would be different. I hope this makes sense.
For the issue of degrees of freedom --
The default degrees of freedom for your model is the containment method. A description of this method can be found in the documentation at the link below --
It says -
"The DDFM=CONTAIN method is carried out as follows: Denote the fixed effect in question as A and search the G-side random effect list for the effects that syntactically contain A. For example, the effect B(A) contains A, but the effect C does not, even if it has the same levels as B(A).
Among the random effects that contain A, compute their rank contributions to the matrix (in order). The denominator degrees of freedom that is assigned to A is the smallest of these rank contributions. If no effects are found, the denominator degrees of freedom for A is set equal to the residual degrees of freedom, . This choice of degrees of freedom is the same as for the tests performed for balanced split-plot designs and should be adequate for moderately unbalanced designs."
PROC GLIMMIX (and PROC MIXED) goes through all the random effects, for whom the rank contribution to the Z matrix are already computed.
Then the df for the fixed effect is the minimum of the ranks of the random effects that contain the fixed effect.
The rank contribution to Z matrix is computed through an algorithm called SWEEPING. This algorithm sweeps the columns of the Z'Z matrix from the top left corner to the lower right corner. When a column is linearly dependent on the columns swept already, that column doesn't count towards the rank. This happens to the INTERCEPT term when it is specified after TIME, so its rank contribution is 0. TIME actually got all the random contributions of 14.
If you use another ddfm method, such as ddfm=kr in both models, you would see consistent df estimations. This method does not depend on the order in which you specify your random effects.
For the issue of nonconvergence in PROC GLIMMIX -- You might use a PARMS statement to specify different set of starting values; or use different estimation method, or change your model.
The recommendation is to specify your random effects in an hierarchical order in PROC MIXED / PROC GLIMMIX for numerical considerations.
Hope this information is helpful. Thanks for using SAS! Please let me know if I misunderstood your question or if you have further questions.
Thanks,
Jill Tao
Principal Technical Support Statistician
This is a good heads up, Michael. I have never used that ordering, so it would have come as a major shock to the system. I think this rises to the level of "bug" and should really be addressed by Tech Support for a hot fix.
Steve Denham
I contacted tech support. It appears to be a quality of the containment method of df. Here's their full response:
Michael,
Thank you for contacting SAS Technical Support regarding your question on PROC MIXD / PROC GLIMMIX.
This is not a bug with PROC MIXED or PROC GLIMMIX.
For the issue of G matrix --
If you had used TYPE=VC not TYPE=UN, you would have seen the G matrix switched the order of elements, as you expected.
However, with TYPE=UN, you will have non-zero covariance, and things can be more complicated. When you specify different order of the random effects on your RANDOM statement, the Z matrices are different. The covariance in your data is estimated by V=ZGZ'+R. While V should be the same between the two specifications of your models (and so is the R matrix), G cannot be the same because of different Z matrix. In other words, the ZGZ' part is the same between the two specifications, but because Z matrix is different and G has a UN structure, the estimated elements in the G matrix would be different. I hope this makes sense.
For the issue of degrees of freedom --
The default degrees of freedom for your model is the containment method. A description of this method can be found in the documentation at the link below --
It says -
"The DDFM=CONTAIN method is carried out as follows: Denote the fixed effect in question as A and search the G-side random effect list for the effects that syntactically contain A. For example, the effect B(A) contains A, but the effect C does not, even if it has the same levels as B(A).
Among the random effects that contain A, compute their rank contributions to the matrix (in order). The denominator degrees of freedom that is assigned to A is the smallest of these rank contributions. If no effects are found, the denominator degrees of freedom for A is set equal to the residual degrees of freedom, . This choice of degrees of freedom is the same as for the tests performed for balanced split-plot designs and should be adequate for moderately unbalanced designs."
PROC GLIMMIX (and PROC MIXED) goes through all the random effects, for whom the rank contribution to the Z matrix are already computed.
Then the df for the fixed effect is the minimum of the ranks of the random effects that contain the fixed effect.
The rank contribution to Z matrix is computed through an algorithm called SWEEPING. This algorithm sweeps the columns of the Z'Z matrix from the top left corner to the lower right corner. When a column is linearly dependent on the columns swept already, that column doesn't count towards the rank. This happens to the INTERCEPT term when it is specified after TIME, so its rank contribution is 0. TIME actually got all the random contributions of 14.
If you use another ddfm method, such as ddfm=kr in both models, you would see consistent df estimations. This method does not depend on the order in which you specify your random effects.
For the issue of nonconvergence in PROC GLIMMIX -- You might use a PARMS statement to specify different set of starting values; or use different estimation method, or change your model.
The recommendation is to specify your random effects in an hierarchical order in PROC MIXED / PROC GLIMMIX for numerical considerations.
Hope this information is helpful. Thanks for using SAS! Please let me know if I misunderstood your question or if you have further questions.
Thanks,
Jill Tao
Principal Technical Support Statistician
I just wanted to praise Jill Tao for an outstanding explanation of sweeping and the order of random effects. Explanation of the role of Z and G in constructing V is also great. Many will benefit from this posting.
I gave it a "Like" and would encourage anyone else in this forum to do so as well. The moral of the story is "Theory counts." I had (wrongly) assumed that ZGZ' would necessarily be symmetric. It is not, under the UN parameterization.
Thank you, Jill.
Steve Denham
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.
Find more tutorials on the SAS Users YouTube channel.