I'm trying to validate the shrinkage that gets applied in the estimation of random effects solutions. (I fit a pretty basic random effects only model, no fixed effects, just intercepts). However, my estimates look like they're a little off. Can anyone provide any resources or help validating?
Update: Here's the code I used to run the procedure:
proc glimmix data=data6;
  by rating_group;
  class duration;
  /* the offset variable is just 1 for all entries - I was getting errors without that fix */
  model relativity = offset / solution noint;
  random intercept / subject=duration solution;
run;
Below are the parameter results I'm seeing:
Random Effect Level | Parameter estimated by Hand | From Model |
1 | 1.20 | 1.20 |
2 | 1.22 | 1.29 |
3 | 1.21 | 1.24 |
4 | 1.06 | 1.12 |
5 | 1.17 | 1.17 |
6 | 1.07 | 1.12 |
7 | 1.02 | 1.09 |
8 | 0.98 | 1.03 |
9 | 1.02 | 1.09 |
10 | 0.96 | 1.02 |
11 | 0.97 | 1.00 |
Other notes - I used the formula
alpha = Z * xbar + (1-Z) * mu
with Z = V1/(V1 + V2),
where V1 is the variance of level i of the random factor and V2 is the variance of the random effects.
Still more notes:
I calculated V1 as the variance (from proc means) divided by the number of observations in that random effect level
I calculated V2 as the variance of the means of each random effect
I'm just starting to learn about random effects models, so any help or resources would be greatly appreciated! Thanks!
You did not mention which procedure you are using to get the random effects. At a minimum, it would help to post the code you used for that.
You can get differences when using summaries from other procedures because, by default, most modeling procedures drop any record that has a missing value for any variable on the model statement. Proc Means/Summary/Univariate would use every nonmissing value of the analysis variable, including values from observations the model excluded. So your by-hand calculation may be based on different records. Check the summary table to see whether the number of observations used in the model matches your proc means. If they don't match, rerun your proc means, ensuring that only records with all variables present are used in the summaries. A where statement similar to:
where not missing(var1) and not missing(var2) and not missing(var3);
in your proc means code might bring the by-hand calculations closer if observations were excluded in your modeling proc.
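As a rough sketch of that check, using the data set and variable names from the GLIMMIX step in the updated post (data6, relativity, offset, duration, rating_group), the summary could be restricted to complete records as below. It assumes data6 is already sorted by rating_group, which the BY statement in the model step also needs, and it lists only the variables that appear in the posted code.

```sas
/* Sketch only: summarize the same records the model would use.        */
/* Assumes data6 is sorted by rating_group (required for BY groups).   */
proc means data=data6 n mean var;
  /* keep only records with every modeled variable present */
  where not missing(relativity)
    and not missing(offset)
    and not missing(duration);
  by rating_group;      /* same BY groups as the GLIMMIX step */
  class duration;       /* one row per random-effect level    */
  var relativity;
run;
```

The N column for each duration level can then be compared with the number of observations used that the modeling procedure reports for the same BY group.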
Thanks for the response!
Unfortunately, I think the dataset is pretty clean and tidy - I cleaned up any missing or other problem values before running. The counts from proc means match the counts from proc glimmix. I'm thinking the problem may be related to my math/lack of understanding of random models if anything. I updated my post to include the procedure that I used.
Thanks again!!
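For comparison with the by-hand formula in the original post, here is the textbook shrinkage (BLUP) estimator for a random-intercept model with known variance components. It is written in the post's V1, V2, Z notation, but sigma_u^2 (between-level variance), sigma_e^2 (pooled residual variance), and n_i (observations in level i) are symbols introduced here, not taken from the post. This is a sketch of the standard result, not a confirmed explanation of the differences in the table.

```latex
\begin{aligned}
x_{ij} &= \mu + u_i + e_{ij}, \qquad u_i \sim N(0, \sigma_u^2), \quad e_{ij} \sim N(0, \sigma_e^2) \\
V_{1,i} &= \frac{\sigma_e^2}{n_i}, \qquad V_2 = \sigma_u^2 \\
Z_i &= \frac{V_2}{V_{1,i} + V_2} \\
\hat{\alpha}_i &= Z_i\,\bar{x}_i + (1 - Z_i)\,\hat{\mu} \\
\hat{\mu} &= \frac{\sum_i w_i\,\bar{x}_i}{\sum_i w_i}, \qquad w_i = \frac{1}{V_{1,i} + V_2}
\end{aligned}
```

Three things in this form may be worth checking against the hand calculation:
- The weight on the level mean has V2 in the numerator, so a level with a noisier mean (larger V1) is pulled further toward mu. That is the complement of Z = V1/(V1 + V2).
- The variance of the raw level means estimates sigma_u^2 plus an average of the sigma_e^2/n_i terms, so it tends to overstate V2. The model instead estimates the variance components by restricted (pseudo-)likelihood, with a single residual variance shared across levels.
- mu in the mixed model is a precision-weighted mean of the level means, which need not equal the simple overall mean.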