Solved: Re: Construct G matrix for a mixed model with random effects

kc · Posted 04-17-2022 05:29 PM

I need help constructing a G matrix for the mixed model below. I am using the example in section 12.3.3 of the "Simulating Data with SAS" textbook as a reference so I can simulate a dataset.

The mixed model is as follows:

proc mixed data=sample method=reml;
   class trt pt;
   model followup = baseline trt time time2 time*trt time2*trt / solution outpm=outpm;
   random intercept time / subject=pt;
   ods select CovParms SolutionF;
   ods output CovParms=CovParms SolutionF=SolutionF;
run;

trt has two levels, A and B. time is numeric, ranges from 1 to 7, and is included as both a fixed and a random effect. time2 is the quadratic time term. followup and baseline are the scores at the follow-up time points and at study start, respectively. pt is just the patient ID number.

I am using the default covariance structure, TYPE=VC.

The covariance parameter and fixed-effect estimates are in the tables below.

Covariance Parameter Estimates
Cov Parm	Subject	Estimate
Intercept	pt	265.65
time	pt	0.05513
Residual		226.46

Solution for Fixed Effects
Effect	trt	Estimate	Standard Error	DF	t Value	Pr > |t|
Intercept		44.5923	2.1326	580	20.91	<.0001
Baseline		0.3187	0.03407	1349	9.35	<.0001
trt	A	9.4363	1.8590	1349	5.08	<.0001
trt	B	0	.	.	.	.
time		0.5883	0.09601	508	6.13	<.0001
time2		-0.01128	0.001628	1349	-6.93	<.0001
time*trt	A	-0.5680	0.1281	1349	-4.43	<.0001
time*trt	B	0	.	.	.	.
time2*trt	A	0.007973	0.002190	1349	3.64	0.0003
time2*trt	B	0	.	.	.	.

Based on the model above, design matrix X has 10 columns, and Z has 2 columns.

I need some clarity on how to construct the G matrix (its number of rows and columns, and its values) when the covariance structure is VC. I would also appreciate comments on how G changes when the structure is UN (unstructured) or CS (compound symmetry).

UN:

Covariance Parameter Estimates
Cov Parm	Subject	Estimate
UN(1,1)	pt	308.72
UN(2,1)	pt	-1.8769
UN(2,2)	pt	0.08507
Residual		220.02

CS:

Covariance Parameter Estimates
Cov Parm	Subject	Estimate
Variance	pt	225.59
CS	pt	43.1601
Residual		247.97

SteveDenham · Posted 04-18-2022 10:19 AM

It will be easier to explain how to get a G matrix from the solution if you include the G option after the slash in the RANDOM statement. It may even be obvious where each estimate fits, at least for UN and CS. It is definitely more difficult for more complex variance-covariance structures.
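As a rough sketch of where each estimate lands, here are the 2x2 matrices implied by the covariance parameter estimates posted above. The VC and UN layouts follow directly from the parameter labels. The CS layout is an assumption: it takes the PROC MIXED parameterization to be "Variance" plus "CS" on the diagonal and "CS" off the diagonal, so it is worth checking against the G option output.

```latex
\[
G_{\mathrm{VC}} =
\begin{pmatrix} 265.65 & 0 \\ 0 & 0.05513 \end{pmatrix},
\qquad
G_{\mathrm{UN}} =
\begin{pmatrix} 308.72 & -1.8769 \\ -1.8769 & 0.08507 \end{pmatrix}
\]
\[
G_{\mathrm{CS}} =
\begin{pmatrix}
225.59 + 43.1601 & 43.1601 \\
43.1601 & 225.59 + 43.1601
\end{pmatrix}
=
\begin{pmatrix} 268.7501 & 43.1601 \\ 43.1601 & 268.7501 \end{pmatrix}
\]
```

Note that CS forces the intercept and the slope on time to share one variance, which may not be sensible when the two random effects are on such different scales.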

SteveDenham

StatsMan · Posted 04-18-2022 12:10 PM

What @SteveDenham said. Adding the G option to the RANDOM statement will show you the G matrix. If Z has 2 columns, then G will be 2x2. You can use the V option on the RANDOM statement to see the full covariance matrix for your data. If you have a SUBJECT= effect on your RANDOM statement, then MIXED prints the block of V for the first subject only. Without a SUBJECT= effect, you get the full covariance matrix. Be aware that this output will be very large if you have a large number of observations in your input data set.
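A minimal sketch of those options, reusing the model from the original post. GCORR and VCORR are added here only for illustration; they request the correlation versions of the same matrices.

```sas
proc mixed data=sample method=reml;
   class trt pt;
   model followup = baseline trt time time2 time*trt time2*trt / solution;
   /* G and GCORR display the 2x2 covariance/correlation matrix of the random effects. */
   /* V and VCORR display the block of V for the first level of PT.                    */
   random intercept time / subject=pt g gcorr v vcorr;
run;
```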

You made the comment that TIME is both fixed and random in this model. That is not quite accurate. TIME is a fixed effect. You are fitting random adjustments to the intercept and slope on time for each level of PT. Does that make sense?

kc · Posted 04-18-2022 01:10 PM

@StatsMan: Yes, your correction about 'TIME' is the right interpretation from the model specified.

Is there a way to output G (and/or V) for all patients (depending on the covariance structure, of course) into a data set for post-processing?

Thanks!

SteveDenham · Posted 04-18-2022 01:42 PM

Sure. Make sure you have specified the G option in your RANDOM statement. Then ODS output can be used for this. You have:

ods output CovParms=CovParms SolutionF=SolutionF;

Change this to:

ods output CovParms=CovParms SolutionF=SolutionF G=Gmatrix;

and you will have a dataset called Gmatrix. NOTE: This may look unusual (lower triangular in long form or some such). If it looks right, you can use it directly in code for simulating MVN data (I hope that is correct, @Rick_SAS). The same applies to the V matrix - just be sure you have the V option specified for the RANDOM statement.
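Put together, the call might look like the sketch below. The data set names Gmatrix and Vmatrix are arbitrary, TYPE=UN is shown only as an example, and the G and V tables are produced only when the matching options appear on the RANDOM statement.

```sas
proc mixed data=sample method=reml;
   class trt pt;
   model followup = baseline trt time time2 time*trt time2*trt / solution;
   random intercept time / subject=pt type=un g v;
   /* Capture the estimates plus the displayed G and V matrices as data sets. */
   ods output CovParms=CovParms SolutionF=SolutionF G=Gmatrix V=Vmatrix;
run;

/* Inspect the layout before using the matrix in a simulation. */
proc print data=Gmatrix;
run;
```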

SteveDenham

kc · Posted 04-18-2022 02:35 PM

@SteveDenham: Thanks for your quick responses and suggestions!

kc · Posted 04-18-2022 12:41 PM

Estimated G and GCORR matrices for the "VC" and "UN" covariance structures from the SAS output:

VC:

Estimated G Matrix
Row	Effect	Subject	Col1	Col2
1	Intercept	1	265.65
2	time	1		0.05513

Estimated G Correlation Matrix
Row	Effect	Subject	Col1	Col2
1	Intercept	1	1.0000
2	time	1		1.0000

UN:

Estimated G Matrix
Row	Effect	Subject	Col1	Col2
1	Intercept	1	308.72	-1.8769
2	time	1	-1.8769	0.08507

Estimated G Correlation Matrix
Row	Effect	Subject	Col1	Col2
1	Intercept	1	1.0000	-0.3662
2	time	1	-0.3662	1.0000

Just to reiterate, the goal is to simulate some "dummy" data based on the estimated parameters from the mixed model specified.

Thanks.

StatsMan · Posted 04-18-2022 02:01 PM

The G matrix gives you the variance/covariance matrix for the random adjustments to the intercept and the slope on TIME in your model. For the uncorrelated case (TYPE=VC), you can use the variances on the diagonal to generate random normal variates (mean=0) and add those to the fixed-effect intercept and slope on TIME. The correlated case (TYPE=UN) is a bit trickier. You can plug the covariance matrix into PROC SIMNORMAL or into the RANDNORMAL function in SAS/IML, or go old-school and conditionally generate the correlated random variates in the DATA step. The section on HLMs in this SAS Global Forum paper will help with the uncorrelated case. The section on repeated measures models will help you understand how to generate correlated data in IML or SIMNORMAL and then merge that data back in with a simulation.
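For the correlated case, a sketch along these lines might work. It uses the TYPE=UN estimates posted in this thread; the number of patients (nPt), the seeds, and the visit schedule (time = 1 to 7) are assumptions for illustration only. It generates just the random part of the model, since the fixed-effect part needs baseline and trt values.

```sas
proc iml;
   call randseed(12345);
   nPt = 100;                               /* hypothetical number of patients */
   G   = {308.72  -1.8769,
          -1.8769  0.08507};                /* estimated G matrix (TYPE=UN) */
   mu  = {0 0};                             /* random effects have mean zero */
   b   = randnormal(nPt, mu, G);            /* nPt x 2: intercept and slope adjustments */
   out = t(1:nPt) || b;                     /* prepend a patient ID column */
   create RandEff from out[colname={"pt" "b0" "b1"}];
   append from out;
   close RandEff;
quit;

data SimRandom;
   call streaminit(54321);
   set RandEff;
   do time = 1 to 7;                        /* assumed visit schedule */
      e = rand("Normal", 0, sqrt(220.02));  /* residual with variance 220.02 */
      random_part = b0 + b1*time + e;       /* Z*b + e for this observation */
      output;
   end;
run;
```

Adding the fixed-effects linear predictor from the TYPE=UN fit to random_part would then give a simulated followup value for each patient and time point.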

kc · Posted 04-18-2022 02:35 PM

@StatsMan: Thanks for sharing the paper. I was able to simulate data for TYPE=VC. I will report back after implementing data simulation using TYPE=UN.

One last issue: with TYPE=UN, the G matrix that I output via the ODS statement still shows values only for the first patient. I hope I am not missing something conceptually.

StatsMan · Posted 04-19-2022 08:20 AM

With your RANDOM statement, the G matrix is 2x2. You have the diagonal elements (the UN(1,1) and UN(2,2) variances) and the off-diagonal element (the UN(2,1) covariance). What MIXED outputs with the G option is the complete G matrix for this model: the same 2x2 block applies to every level of PT, which is why only the first subject is shown. Your Z matrix (the design matrix for the random effects) has one row per observation and 2 columns for each subject. Taking Z*G*Z' and adding the residual covariance matrix R gives you the NxN covariance matrix V for the data, where N is the number of observations. The V option on the RANDOM statement outputs the block of the V matrix for the 1st subject. You can output other blocks of V with V= on the RANDOM statement. The V matrix for this model is block-diagonal, so showing the blocks is sufficient to describe V.
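As a worked illustration (computed by hand, not taken from the PROC MIXED output), the block of V for one patient can be written out. This assumes the patient is observed at times t_1, ..., t_n and that the residual covariance is the default R = sigma^2 I, and it uses the TYPE=UN estimates posted earlier in the thread.

```latex
\[
V_i = Z_i G Z_i^{\top} + \sigma^2 I_n,
\qquad
Z_i = \begin{pmatrix} 1 & t_1 \\ \vdots & \vdots \\ 1 & t_n \end{pmatrix},
\qquad
G = \begin{pmatrix} g_{11} & g_{21} \\ g_{21} & g_{22} \end{pmatrix}
\]
\begin{align*}
\operatorname{Var}(y_{ij}) &= g_{11} + 2 t_j\, g_{21} + t_j^2\, g_{22} + \sigma^2 \\
\operatorname{Cov}(y_{ij}, y_{ik}) &= g_{11} + (t_j + t_k)\, g_{21} + t_j t_k\, g_{22},
\quad j \neq k
\end{align*}
\begin{align*}
\operatorname{Var}(y_{i1}) \text{ at } t = 1 &:\quad
308.72 + 2(-1.8769) + 0.08507 + 220.02 \approx 525.07 \\
\operatorname{Cov}(y_{i1}, y_{i2}) \text{ at } t = 1, 2 &:\quad
308.72 + 3(-1.8769) + 2(0.08507) \approx 303.26
\end{align*}
```

If the V option output for the first subject shows values close to these in its first row, the G and residual estimates are being combined as expected.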

SteveDenham · Posted 04-19-2022 07:24 AM

Great answer, @StatsMan. I had forgotten all about PROC SIMNORMAL.

SteveDenham
