Thank you for clarifying your data set structure. I'm more sure now that we're on the same page.
Focusing on the covariate X: Because there are multiple (X,Y) observations (one for each level of Level2) within each level of Level3, we (or, more properly, the statistical model) are able to fit a regression of Y on X for each level of Level3. As you note, this set of regressions may have appreciable variance among intercepts, variance among slopes, and covariance between intercepts and slopes; these (co)variances are derived from the multiple Level3 regressions. Consequently, although you can assess whether there are random intercepts and random slopes, I'd say that assessment is "among" levels of Level3; there is no random intercept/slope among levels of Level2 because the model uses the different levels of Level2 (within each level of Level3) to define the regressions. I hope that makes sense.
I failed to define "Xmean" and "Bmean". Xmean is the mean of the X values over the levels of Level2 for each level of Level3; it's like moving the X values up a tier, from Level2 to Level3, as if Xmean were measured at Level3. I hope that makes sense, too. This concept is addressed in the Singer paper (SES and MEANSES) I linked in an earlier response. Although I didn't intend them as centered variables, they certainly could be, and they are in the Singer paper. If you center them correctly, both should still vary (i.e., they won't be constant zero, although their means will be zero). Should you center? Your call. If the model includes interactions (including polynomial terms, like X*X), then centering is very useful and can reduce collinearity. In a model without interactions, it's less critical, I think. Centering doesn't hurt; you just have to rescale results to undo the centering if you want results on the original scales. Should you include Xmean and Bmean? Again, your call.
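To make the centering option concrete, here is a minimal, untested sketch; the dataset and variable names (SampleData, x, xmean) are assumed, not taken from your data. PROC SQL remerges the overall means back onto each row, so the centered versions keep their variability but have mean zero:

```sas
/* Untested sketch; names are assumed. Grand-mean center x and xmean;
   PROC SQL automatically remerges the summary means onto every row. */
proc sql;
  create table SampleData_c as
  select *,
         x     - mean(x)     as x_c,
         xmean - mean(xmean) as xmean_c
  from SampleData;
quit;
```

If you fit the model with x_c and xmean_c, remember to back-transform if you want to report effects on the original scales.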
If it were me, because there are no covariates at Level1, I would compute the mean of Y over the levels of Level1 for each combination of Level2 and Level3 and then use that mean Y as the response in the simpler, two-level model. Nothing wrong with an easier life 🙂 You would then be able to omit the second RANDOM statement. If the number of levels of Level1 is the same for all combinations of Level2 and Level3, then the statistical tests for fixed effects will be very similar, if not identical, to those from the three-level model. If the number of levels of Level1 varies dramatically among combinations of Level2 and Level3, then I might keep the three-level model.
I haven't looked in any detail at the paper you found with the macro for assessing assumptions. If you adequately understand how the macro is addressing assumptions, and know what the assumptions are and how to extract what you need from the MIXED procedure, you theoretically would be able to extend the methods to a three-level model. In a sense, your statistical model is a multiple regression in a mixed model, so you have all the assumptions associated with multiple regression plus the assumptions associated with a mixed model. A busy task, but not horribly difficult.
Good luck!
Thank you so much for your well-put answer @sld. Everything makes a lot of sense to me now.
Yes, I understand now why the assessment of random intercept/slope is "among" levels of Level3, not "among" levels of Level2. Thank you for explaining it clearly.
I can see that 'Xmean' has a nice interpretation if it's significant. I'm not sure there is any other advantage to including it in the model, but I guess I might keep it.
Reducing this model to a 2-level model will be wonderful! Could you please have a look at the attached data to see if I did the reduction right? I assume that if it works, my model will look like this:
PROC MIXED data=SampleData covtest;
  class Newlevel2;
  model NewY = C D E F Xmean X B / ddfm=SATTERTHWAITE s;
  random intercept X / subject=Newlevel2 type=UN;
run;
I am not interested in including Bmean and B in the model and random statement respectively, unless they improve the fit or have some distinct advantages. I am sorry, but I am not sure I understood what you meant by 'If the number of levels of Level1 is the same for all combinations of Level2 and Level3' and 'If the number of levels of Level1 varies dramatically among combinations of Level2 and Level3'. Can you explain it a little bit?
If the idea of model reduction works, then I assume I can use the macro for testing assumptions that was written for 2-level models. It is a good macro, elaborate and well explained.
Thank you again @sld. You were a big help! Cheers!
I'm happy my response was coherent and, better yet, useful to you.
You get to choose how to build your model. If the number of observations is small relative to the number of predictor variables, then you may have to implement some form of variable selection to avoid overfitting. If the number of observations is large, then you might throw everything in (assuming no problems with multicollinearity, etc.) even if some predictor variables are not significant. Or you might want a parsimonious model.
If you are assessing the support for fixed effects factors, e.g., Xmean, and want to use an information criterion to decide, then remember that you must use a true maximum likelihood method (method=ml) rather than the default (method=reml) for the IC to be valid.
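For example, here is an untested sketch of that comparison; the dataset name (SampleData_means_2, i.e., the two-level data of Y means) and the model terms are assumed for illustration:

```sas
/* Untested sketch (names assumed): fit with method=ml so AIC/BIC are
   valid for comparing models that differ in fixed effects. */
proc mixed data=SampleData_means_2 method=ml covtest;
  class level3;
  model y = c d e f xmean x / ddfm=satterth s;
  random intercept x / subject=level3 type=un;
run;
/* Repeat with xmean dropped, compare AIC in the Fit Statistics
   tables, then refit the chosen model under the default REML. */
```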
I'd try something like this (which I have not tested). It's close to what you have but includes everything apart from the kitchen sink, and the naming of things is different. Your new dataset looks fine.
/* Compute means over Level1 within Level2*Level3 */
proc sort data= SampleData;
by level3 level2;
run;
proc means data= SampleData noprint;
by level3 level2;
id c d e f x b;
var y;
output out=SampleData_means mean= ;
run;
/* Compute Xmean and Bmean (means of variables measured at Level2) */
proc means data=SampleData_means noprint;
by level3;
var x b;
output out=level3_means mean= xmean bmean;
run;
/* Merge means with data */
data SampleData_means_2;
merge SampleData_means level3_means;
by level3;
run;
/* Fit really busy, very full model */
proc mixed data=SampleData_means_2 covtest;
  class level3;
  model y = c d e f xmean bmean x b / ddfm=satterth s;
  random intercept x b / subject=level3 type=un;
run;
Regarding unequal numbers of levels of Level1: Paraphrasing and quoting from a chapter of Carl Schwarz's online text
http://people.stat.sfu.ca/~cschwarz/Stat-650/Notes/PDFbigbook-SAS/SAS-part017.pdf :
If the number of levels of Level1 varies among levels of Level2, then levels of Level2 with more levels of Level1 will have smaller sampling uncertainty than levels of Level2 with fewer levels of Level1. Consequently, "process + sampling error is not constant over the entire regression line. Estimates are still unbiased, but not fully efficient." (p 1104) The analysis using means is approximate to the extent that the Level1 data are unbalanced across the levels of Level2; the smaller the imbalance, the better the approximation. Hoping that makes sense.
If you don't have a lot of levels for Level3, you may have estimation problems with type=UN. You could try UN(1), which sets the covariance to zero, leaving just independent random intercepts and random slopes. (My datasets rarely allow estimation of the covariance; they are just too small.)
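For instance (untested sketch, with dataset and variable names assumed), only the TYPE= option on the RANDOM statement changes:

```sas
/* Untested sketch (names assumed): type=un(1) keeps the two variances
   but fixes the intercept-slope covariance at zero, leaving
   independent random intercepts and slopes. */
proc mixed data=SampleData_means_2 covtest;
  class level3;
  model y = c d e f xmean x / ddfm=satterth s;
  random intercept x / subject=level3 type=un(1);
run;
```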
Thanks, @sld, for your further explanations and the sample code! My apologies for getting back to you so late; life happened!
I have about 1000 level1 observations (Y) and 8 covariates. So I believe I don't need to consider variable selection in my case.
Thank you for pointing out the estimation method (ML vs. REML). I will mainly be assessing the fixed effects; however, I am also interested in seeing whether the random effects are significant. If I use ML instead of REML, will my significant random effects still make sense?
I really appreciate the sample code you provided. I have tested it, and it works! Thank you so much!
Yes, I understand now what you meant by unbalanced data.
Although I have a fair number of levels of level3, the estimated covariance between the random intercept and slope often comes out as zero. I will try UN(1); thank you for pointing it out.
Finally, a new question! How would my interpretation change if I reduce a 3-level model to a 2-level model? Specifically, in regression we estimate the mean change in Y for a unit change in X. Since my Y is already averaged over the levels of level1, do I interpret it as 'the mean change in Y at level2'?
You're welcome. I'm happy it's making sense.
Using AIC based on ML works for comparing models that differ in fixed and/or random effects. (AIC based on REML is good only for models that differ in random effects.) Variance estimates under REML are thought to be less biased than those under ML (although extent of bias depends on sample size), so sometimes you'll see people select a model using ML, then refit with REML. But if your sample size is adequate and you are not aiming for a parsimonious model, this may not matter much to you this time.
Regarding your question about interpretation: in this case (with no covariates measured at Level1), I don't see that moving from a 3-level model to a 2-level model by averaging over Level1 alters your interpretation; it just makes the model simpler.