I need help constructing a G matrix for the mixed model below. I am using the example in section 12.3.3 of the "Simulating Data with SAS" textbook as a reference so I can simulate a dataset.
The mixed model is as follows:
proc mixed data=sample method=reml;
class trt pt;
model followup = baseline trt time time2 time*trt time2*trt / solution outpm=outpm;
random intercept time/subject=pt;
ods select CovParms SolutionF;
ods output CovParms=CovParms SolutionF=SolutionF;
run;
Trt has 2 levels, A and B. Time is numeric, ranges from 1 to 7, and is included as both a fixed and a random effect. Time2 is time squared (the quadratic term). followup and baseline are the scores at the follow-up time points and at study start. Pt is the patient ID number.
I am using the default covariance structure which is 'VC'.
The covariance parameter and fixed-effects estimates are in the tables below:
Covariance Parameter Estimates

| Cov Parm | Subject | Estimate |
|---|---|---|
| Intercept | pt | 265.65 |
| time | pt | 0.05513 |
| Residual | | 226.46 |
Solution for Fixed Effects

| Effect | trt | Estimate | Standard Error | DF | t Value | Pr > \|t\| |
|---|---|---|---|---|---|---|
| Intercept | | 44.5923 | 2.1326 | 580 | 20.91 | <.0001 |
| Baseline | | 0.3187 | 0.03407 | 1349 | 9.35 | <.0001 |
| trt | A | 9.4363 | 1.8590 | 1349 | 5.08 | <.0001 |
| trt | B | 0 | . | . | . | . |
| time | | 0.5883 | 0.09601 | 508 | 6.13 | <.0001 |
| time2 | | -0.01128 | 0.001628 | 1349 | -6.93 | <.0001 |
| time*trt | A | -0.5680 | 0.1281 | 1349 | -4.43 | <.0001 |
| time*trt | B | 0 | . | . | . | . |
| time2*trt | A | 0.007973 | 0.002190 | 1349 | 3.64 | 0.0003 |
| time2*trt | B | 0 | . | . | . | . |
Based on the model above, design matrix X has 10 columns, and Z has 2 columns.
I need some clarity on how to construct the G matrix (its dimensions and its values) when the covariance structure is VC. Could you also comment on how G changes when the structure is UN (unstructured) or CS (compound symmetry)?
UN:
Covariance Parameter Estimates

| Cov Parm | Subject | Estimate |
|---|---|---|
| UN(1,1) | pt | 308.72 |
| UN(2,1) | pt | -1.8769 |
| UN(2,2) | pt | 0.08507 |
| Residual | | 220.02 |
CS:
Covariance Parameter Estimates

| Cov Parm | Subject | Estimate |
|---|---|---|
| Variance | pt | 225.59 |
| CS | pt | 43.1601 |
| Residual | | 247.97 |
With your RANDOM statement, the G matrix is 2x2. For TYPE=UN you have the diagonal elements (the UN(1,1) and UN(2,2) variances) and the off-diagonal element (the UN(2,1) covariance). What MIXED outputs with the G option is the complete G matrix for this model. Because you specify SUBJECT=pt, that 2x2 matrix is the block for one subject, and the same block applies to every subject. Within a subject, the Z matrix (the design matrix for the random effects) has one row per observation and 2 columns: a column of 1s for the intercept and a column of time values. Z*G*Z' gives the random-effects part of the covariance matrix of the data; adding the residual covariance matrix R gives V = Z*G*Z' + R. The V option on the RANDOM statement outputs the block of the V matrix for the 1st subject. You can output other blocks of V with V= on the RANDOM statement. The V matrix for this model is block-diagonal, so showing the blocks is sufficient to describe V.
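Written out with the estimates posted in the question, the per-subject G blocks look as follows. The VC and UN matrices come directly from the covariance parameter tables. The CS matrix is a sketch that assumes the usual PROC MIXED parameterization for TYPE=CS (the "Variance" parameter added on the diagonal, the "CS" parameter in every cell); the G matrix for that fit was not posted, so it is worth confirming with the G option. The symbols Z_i, V_i, t_ij, and n_i are notation introduced here for illustration.

```latex
% Per-subject G blocks implied by the posted covariance parameter estimates
G_{\mathrm{VC}} = \begin{pmatrix} 265.65 & 0 \\ 0 & 0.05513 \end{pmatrix},
\qquad
G_{\mathrm{UN}} = \begin{pmatrix} 308.72 & -1.8769 \\ -1.8769 & 0.08507 \end{pmatrix}

% Assumed TYPE=CS parameterization: Variance + CS on the diagonal, CS off the diagonal
G_{\mathrm{CS}} = \begin{pmatrix} 225.59 + 43.1601 & 43.1601 \\ 43.1601 & 225.59 + 43.1601 \end{pmatrix}
              = \begin{pmatrix} 268.7501 & 43.1601 \\ 43.1601 & 268.7501 \end{pmatrix}

% Covariance block for subject i, observed at times t_{i1}, ..., t_{i n_i}
Z_i = \begin{pmatrix} 1 & t_{i1} \\ \vdots & \vdots \\ 1 & t_{i n_i} \end{pmatrix},
\qquad
V_i = Z_i \, G \, Z_i^{\top} + \sigma^2 I_{n_i}
```

Here σ² is the Residual estimate from the matching fit (226.46 for VC, 220.02 for UN, 247.97 for CS). Under this reading, CS would force the intercept and the time slope to share one variance, which is a strong constraint for effects on such different scales.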
It will be easier to explain how to build the G matrix from the estimates if you add the G option after the slash in the RANDOM statement. It may even be obvious where each estimate fits, at least for UN and CS. It is definitely more difficult for more complex variance-covariance structures.
SteveDenham
What @SteveDenham said. Adding the G option to the RANDOM statement will show you the G matrix. If Z has 2 columns, then G will be 2x2. You can use the V option on the RANDOM statement to see the full covariance matrix for your data. If you have a SUBJECT= effect on your RANDOM statement, then MIXED prints the block of V for the first subject only. Without a SUBJECT= effect, you get the full covariance matrix. Be aware that this output will be very large if you have a large number of observations in your input data set.
You made the comment that TIME is both fixed and random in this model. That is not quite accurate. TIME is a fixed effect. You are fitting random adjustments to the intercept and slope on time for each level of PT. Does that make sense?
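One way to see the distinction is to write the fitted model for observation j on patient i. This is a sketch in notation introduced here (it does not appear in the SAS output): A_i is 1 for trt A and 0 for trt B, the β's are the fixed effects, and b_0i and b_1i are the patient-specific random adjustments to the intercept and to the slope on time.

```latex
\mathrm{followup}_{ij} =
    \beta_0 + \beta_1\,\mathrm{baseline}_i + \beta_2 A_i
  + (\beta_3 + \beta_5 A_i)\,\mathrm{time}_{ij}
  + (\beta_4 + \beta_6 A_i)\,\mathrm{time}_{ij}^2
  + b_{0i} + b_{1i}\,\mathrm{time}_{ij} + \varepsilon_{ij}

\begin{pmatrix} b_{0i} \\ b_{1i} \end{pmatrix} \sim N(\mathbf{0},\, G),
\qquad
\varepsilon_{ij} \sim N(0,\, \sigma^2)
```

The fixed slope on time is β_3 (plus β_5 for trt A); b_1i is a mean-zero deviation from that slope for patient i, and its variance is the second diagonal element of G.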
Sure. Make sure you have specified the G option in your RANDOM statement. Then ODS output can be used for this. You have:
ods output CovParms=CovParms SolutionF=SolutionF;
Change this to:
ods output CovParms=CovParms SolutionF=SolutionF G=Gmatrix;
and you will have a dataset called Gmatrix. NOTE: This may look unusual (lower triangular in long form or some such). If it looks right, you can use it directly in code for simulating MVN data (I hope that is correct, @Rick_SAS). The same applies to the V matrix - just be sure you have the V option specified for the RANDOM statement.
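As a rough sketch of that idea, the SAS/IML step below reads the Gmatrix data set and draws one (intercept, slope) pair per subject with RANDNORMAL. It assumes the data set has numeric columns named Col1 and Col2, as in the printed G table, and that blank cells (the off-diagonals under TYPE=VC) are stored as missing values. The seed, the subject count nSubj, and the output data set name RandEff are hypothetical choices made for illustration.

```sas
proc iml;
   /* Read the numeric columns of the table saved by: ods output G=Gmatrix; */
   use Gmatrix;
   read all var {"Col1" "Col2"} into G;
   close Gmatrix;

   /* Assumption: blank cells (TYPE=VC off-diagonals) arrive as missing; treat them as 0 */
   idx = loc(G = .);
   if ncol(idx) > 0 then G[idx] = 0;

   call randseed(12345);               /* hypothetical seed */
   nSubj = 100;                        /* hypothetical number of subjects */
   b = RandNormal(nSubj, {0 0}, G);    /* row i holds (b0, b1) for subject i */

   /* Save one row of random effects per subject */
   pt = T(1:nSubj);
   b0 = b[,1];
   b1 = b[,2];
   create RandEff var {"pt" "b0" "b1"};
   append;
   close RandEff;
quit;
```

The RandEff data set can then be merged by pt into a DATA step that adds the fixed effects and the residual noise.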
SteveDenham
@SteveDenham: Thanks for your quick responses and suggestions!
Estimated G and GCORR matrices for the "VC" and "UN" covariance structures from the SAS output:
VC:
Estimated G Matrix

| Row | Effect | Subject | Col1 | Col2 |
|---|---|---|---|---|
| 1 | Intercept | 1 | 265.65 | |
| 2 | time | 1 | | 0.05513 |
Estimated G Correlation Matrix

| Row | Effect | Subject | Col1 | Col2 |
|---|---|---|---|---|
| 1 | Intercept | 1 | 1.0000 | |
| 2 | time | 1 | | 1.0000 |
UN:
Estimated G Matrix

| Row | Effect | Subject | Col1 | Col2 |
|---|---|---|---|---|
| 1 | Intercept | 1 | 308.72 | -1.8769 |
| 2 | time | 1 | -1.8769 | 0.08507 |
Estimated G Correlation Matrix

| Row | Effect | Subject | Col1 | Col2 |
|---|---|---|---|---|
| 1 | Intercept | 1 | 1.0000 | -0.3662 |
| 2 | time | 1 | -0.3662 | 1.0000 |
Just to reiterate, the goal is to simulate some "dummy" data based on the estimated parameters from the mixed model specified.
Thanks.
The G matrix gives you the variance/covariance matrix for the random adjustments to the intercept and slope on TIME for your model. For the uncorrelated case (TYPE=VC) you can use the variances on the diagonal to generate random normal variates (mean=0) and add those to the fixed effects for the intercept and the slope on TIME. For the correlated case (TYPE=UN), things are a bit trickier. You can plug the covariance matrix into PROC SIMNORMAL or into the RANDNORMAL function in SAS/IML, or go old-school and conditionally generate the correlated random variates in the DATA step. The section on HLMs in this SAS Global Forum paper will help with the uncorrelated case. The section on Repeated Measures models will help in understanding how to generate correlated data in IML/SIMNORMAL and then merge that data back in with a simulation.
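A minimal DATA step sketch of the "old-school" route follows, using the estimates posted from the VC fit. It draws b0 first and then b1 conditional on b0; with g21 = 0 this reduces to two independent draws, which is the VC case. Several things here are assumptions rather than facts from the thread: the number of subjects, the alternating treatment assignment, the baseline distribution (mean 50, SD 15), integer visit times 1 to 7, the seed, and all variable names for the coefficients.

```sas
data sim;
   call streaminit(20240101);                      /* hypothetical seed */

   /* Fixed-effect estimates from the posted Solution for Fixed Effects table */
   b_int  = 44.5923;  b_base  = 0.3187;   b_trtA   = 9.4363;
   b_time = 0.5883;   b_time2 = -0.01128;
   b_timeA = -0.5680; b_time2A = 0.007973;

   /* Covariance estimates from the TYPE=VC fit (g21 = 0).                    */
   /* For TYPE=UN use g11=308.72, g21=-1.8769, g22=0.08507, sigma2=220.02,    */
   /* together with the fixed-effect estimates from that fit.                 */
   g11 = 265.65;  g21 = 0;  g22 = 0.05513;  sigma2 = 226.46;

   nSubj = 100;                                    /* hypothetical sample size */
   do pt = 1 to nSubj;
      if mod(pt, 2) = 1 then trt = "A";            /* hypothetical 1:1 allocation */
      else trt = "B";
      isA = (trt = "A");
      baseline = rand("Normal", 50, 15);           /* hypothetical baseline distribution */

      /* Random intercept, then random slope conditional on the intercept */
      b0 = rand("Normal", 0, sqrt(g11));
      b1 = rand("Normal", (g21/g11)*b0, sqrt(g22 - g21**2/g11));

      do time = 1 to 7;                            /* assumes integer visit times */
         time2 = time**2;
         mu = b_int + b_base*baseline + b_trtA*isA
              + (b_time  + b_timeA*isA)*time
              + (b_time2 + b_time2A*isA)*time2;
         followup = mu + b0 + b1*time + rand("Normal", 0, sqrt(sigma2));
         output;
      end;
   end;
   keep pt trt baseline time time2 followup;
run;
```

Refitting the original PROC MIXED step to the sim data set is a quick sanity check: with enough subjects, the covariance and fixed-effect estimates should land near the values used to generate the data.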