trekvana
Calcite | Level 5
hello all-

So we all know that when we are working with mixed effects models we have marginal and conditional residuals. What are the advantages of each of these residuals? For example:

When we are trying to evaluate the overall model, should we work with the marginal residuals, since these are the residuals representing the population?

If we are trying to evaluate how well the model fits the sample, are the conditional residuals better?

I have read that the marginal residuals should be used when assessing the overall model, for both the mean model and the covariance model.

any advice on this subject?
Paige
Quartz | Level 8
> so we all know that when we are working with mixed
> effects models we have marginal and conditional
> residuals.

We do?

That is not true for everyone: I don't know. Perhaps you could enlighten me.
trekvana
Calcite | Level 5
Paige-

When we specify mixed models we have fixed effects and random effects. The general equation for the model is Y=X*beta + Z*gamma + error. The betas are the fixed parameters and the gammas are the random parameters. Now Z has to be a subset of X; namely, the columns of Z are a subset of the columns of X. Also, since the gammas are random, they are distributed as N(0,G), where G is the covariance matrix of the gammas.

The beta parameters are interpreted as population parameters and the gamma parameters as subject-specific parameters. So, for example, let's say we were analyzing the effect of two treatments, a and b, and we randomly chose 10 clinics from which to select subjects for the study. Since the clinics are a random sample of all the clinics in the population, we can regard the clinics as random effects. More to the point, a simple model would look like:

Y=beta0+beta1*time+beta2*trt+gamma0_i + gamma1_i*time + error

So now each clinic i adds its own intercept and time slope to the model.

The conditional model is E(Y|gamma_i)=beta0+beta1*time+beta2*trt+gamma0_i + gamma1_i*time. This model takes each clinic's contribution into account, hence we are conditioning Y on the gamma_i.

On the other hand, the marginal model is E(Y)=beta0+beta1*time+beta2*trt. The random effects go away because in the marginal model we are averaging over all random effects and, as stated earlier, the random effects have mean zero.

The residuals are the same as always: the actual Y minus the predicted Y. But depending on whether we use the conditional or the marginal model to predict Y, we will get different values of the residuals.
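
To make the distinction concrete, here is a minimal sketch of the two kinds of residuals for the clinic model above. All numbers (the betas, the clinic-specific gammas, and the four observations) are made up for illustration; in practice they would be the estimates and predictions from a fitted model.

```python
# Hypothetical fixed-effect estimates (population level): intercept, time, trt.
beta0, beta1, beta2 = 10.0, 2.0, 1.5

# Hypothetical predicted random effects per clinic: (gamma0_i, gamma1_i).
gamma = {"clinic_1": (1.0, 0.5), "clinic_2": (-1.0, -0.5)}

# Made-up observations: (clinic, time, trt, observed y).
data = [
    ("clinic_1", 0, 0, 11.2),
    ("clinic_1", 1, 0, 13.4),
    ("clinic_2", 0, 1, 10.3),
    ("clinic_2", 1, 1, 12.1),
]


def marginal_prediction(time, trt):
    """E(Y): fixed effects only, random effects averaged out."""
    return beta0 + beta1 * time + beta2 * trt


def conditional_prediction(clinic, time, trt):
    """E(Y | gamma_i): fixed effects plus the clinic's own intercept and slope."""
    g0, g1 = gamma[clinic]
    return marginal_prediction(time, trt) + g0 + g1 * time


def residuals(rows):
    """Return (clinic, time, marginal residual, conditional residual) per row."""
    out = []
    for clinic, time, trt, y in rows:
        r_marginal = y - marginal_prediction(time, trt)
        r_conditional = y - conditional_prediction(clinic, time, trt)
        out.append((clinic, time, r_marginal, r_conditional))
    return out


for clinic, time, r_m, r_c in residuals(data):
    print(f"{clinic} time={time}: marginal={r_m:+.2f} conditional={r_c:+.2f}")
```

For each observation the two residuals differ by exactly gamma0_i + gamma1_i*time, the clinic's own deviation from the population line. In PROC MIXED, the corresponding predicted values can be requested with the OUTPM= (marginal) and OUTP= (conditional) options on the MODEL statement.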

Hopefully this helps. Let me know if I can clear anything up.
Dale
Pyrite | Level 9
I agree with most of your description here regarding marginal and conditional models. However, it is not correct to state that "Z has to be a subset of X". There is no such requirement.

The X design matrix is constructed according to the variables named on the MODEL statement in PROC MIXED (expanded according to formatted values if a variable also appears on the CLASS statement). The Z design matrix is constructed according to the variables named on the RANDOM statement (also expanded according to formatted values if the variable is named on the CLASS statement). It is usually the case that variables named on the RANDOM statement are also named on the MODEL statement, but it is not a necessity.

It should be noted that effects for variables named only on the RANDOM statement are assumed normally distributed with mean zero. If the variable is also named on the MODEL statement, then the random effects are assumed normally distributed with mean as determined by the fixed effect estimate.
trekvana
Calcite | Level 5
Thanks for the correction, Dale. I was thinking of the two-stage random effects formulation, where we first formulate Y_i=Z_i*beta_i + error_i and then model the beta_i in terms of population parameters: beta_i=A_i*beta + gamma_i.

Putting these two models together we have Y_i = (Z_i*A_i)*beta + Z_i*gamma_i + error_i
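
Spelled out, the substitution is a single step. The notation below simply restates the two stages in LaTeX, writing \varepsilon_i for error_i:

```latex
\begin{align*}
Y_i &= Z_i \beta_i + \varepsilon_i
      && \text{(stage 1)} \\
    &= Z_i \,(A_i \beta + \gamma_i) + \varepsilon_i
      && \text{(stage 2: } \beta_i = A_i \beta + \gamma_i \text{)} \\
    &= (Z_i A_i)\,\beta + Z_i \gamma_i + \varepsilon_i .
\end{align*}
```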

If we let X_i=Z_i*A_i then we are back to the mixed model formula, with ONE exception. As stated in Fitzmaurice's Applied Longitudinal Analysis, p. 203:

"The two stage formulation requires that the design matrix for the fixed effects has the special structure X_i=Z_i*A_i where A_i contains only between subject (or time-invariant) covariates and Z_i contains only within-subject (or time-varying) covariates. This form of the design matrix for the fixed effects implies that any time-varying covariates must be specified as random effects to ensure their inclusion in the model for the population mean response."

More importantly, it goes on to say: "This constraint is unnecessary and, in many settings, it can be somewhat inconvenient."

Dale - as for the initial question, do you think my reasoning about which residuals to use is correct?
Dale
Pyrite | Level 9
With regard to the original question, I really don't know which set of residuals to employ for what purposes. Sorry.
Paige
Quartz | Level 8
Thanks for the explanation.
