stats2554
Calcite | Level 5
I’m using PROC MIXED with treatment, time, and treatment × time as factors, and time as the repeated effect. The reported comparisons use Dunnett’s adjustment to compare three treated groups with a control group. I need to produce a value for RMSE and I’m wondering what the most efficient way to do this is. Thanks!
1 ACCEPTED SOLUTION

StatsMan
SAS Super FREQ

With the REPEATED statement, you are partitioning the R matrix into variance and covariance parameters. With TYPE=CS, there are two components. The CS component is the common covariance between all observations on a given subject. The RESIDUAL component is the diagonal enhancement to the R matrix. Try adding the R option to the REPEATED statement and you can see what the R matrix looks like for the first subject.
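A minimal sketch of that suggestion, assuming the program the original poster shares further down this thread. Only the R and RCORR options on the REPEATED statement are added; everything else is carried over unchanged, so treat it as an illustration rather than a recommended model.

```sas
/* Same model as posted in the thread, with the R matrix printed. */
proc mixed data=Result;
  class subject period treatment time;
  model result = baseline period subject treatment time treatment*time;
  /* R     prints the estimated R block for the first subject       */
  /* RCORR prints the same block rescaled to correlations           */
  repeated time / type=cs subject=subject*period r rcorr;
  lsmeans treatment / adjust=dunnett;
run;
```

With TYPE=CS, the printed block should show the sum of the CS and Residual estimates on the diagonal and the CS estimate everywhere off the diagonal.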

 

Given that you are partitioning the variance into components with the REPEATED statement, you no longer have a single term that is the equivalent of the MSE. If your data are balanced (same number of obs on each subject and for each level of your CLASS effects), then some would advocate reporting the sum of these two components as the MSE. 
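As a worked illustration of that sum, here is the arithmetic using the estimates the original poster reports in this thread (CS = 3.7383, Residual = 43.5428). Whether the data are actually balanced is not stated, so this is a sketch under that assumption, not a definitive RMSE.

```latex
% Compound-symmetry block of R for one subject*period (dimension = number of time points)
\[
\mathbf{R}_i =
\begin{pmatrix}
\sigma_{CS} + \sigma^2 & \sigma_{CS} & \cdots & \sigma_{CS} \\
\sigma_{CS} & \sigma_{CS} + \sigma^2 & \cdots & \sigma_{CS} \\
\vdots & \vdots & \ddots & \vdots \\
\sigma_{CS} & \sigma_{CS} & \cdots & \sigma_{CS} + \sigma^2
\end{pmatrix}
\]

% Total variance of a single observation and its square root
\[
\hat{\sigma}_{CS} + \hat{\sigma}^2 = 3.7383 + 43.5428 = 47.2811,
\qquad
\sqrt{47.2811} \approx 6.88
\]

% Implied within-subject correlation
\[
\hat{\rho} = \frac{3.7383}{47.2811} \approx 0.079
\]
```

Under this reading, a value of about 6.88 would play the role of an RMSE, on the scale of the response variable.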

 

If the data are not balanced, or if you are using a more complicated covariance structure on the REPEATED statement (some value other than CS on TYPE=), then you have partitioned the residual variance in a more complicated fashion and it may be difficult or even impossible to come up with a term that corresponds to the MSE.


7 REPLIES
jiltao
SAS Super FREQ

The concept of RMSE might not apply to models fit in PROC MIXED.  What is your PROC MIXED program for your repeated measures data and what is the output for the covariance parameter estimates table?

Jill

stats2554
Calcite | Level 5

This is my program and covariance parameter estimates.  Is there a similar estimate?

 

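proc mixed data=Result;
  class subject period treatment time;
  model result = baseline period subject treatment time treatment*time;
  /* compound symmetry across time within each subject*period */
  repeated time / type=CS subject=subject*period;
  /* Dunnett-adjusted comparisons of each treatment with the control level */
  lsmeans treatment / adjust=dunnett;
run;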

 

Covariance Parameter Estimates

Cov Parm    Subject           Estimate
CS          subject*period      3.7383
Residual                       43.5428

sbxkoenk
SAS Super FREQ

How do I compare mixed models with proc glimmix?
https://communities.sas.com/t5/Statistical-Procedures/How-do-I-compare-mixed-models-with-proc-glimmi...

 

How to compare 2 mixed models with different Fixed Effects?
https://communities.sas.com/t5/Statistical-Procedures/How-to-compare-2-mixed-models-with-different-F...

 

Usage Note 37107: Comparing covariance structures in PROC MIXED
https://support.sas.com/kb/37/107.html

 

Koen

SteveDenham
Jade | Level 19

This is probably more for @StatsMan, @sbxkoenk and @jiltao than the OP. Could an RMSE-like value be calculated from the residuals obtained with the OUTPRED= option on the MODEL statement? I can conceive of squaring the raw (unscaled) residuals (observed - predicted), summing them up and taking the square root. This seems logical, but if it were, there would be a lot more of it done, so I am obviously missing something. Anyone care to take a stab at this?

 

SteveDenham 

jiltao
SAS Super FREQ

I think your way is computing a summary statistic (variance, standard deviation) for a set of values, rather than a model-based approach. The two are not the same.

Jill

SteveDenham
Jade | Level 19

Thanks, @jiltao. That is likely the case, especially as I left out a step. After calculating the sum of squared deviations and before taking a square root, there needs to be a division by an appropriate degrees of freedom value. So we would have the deviations of the values predicted by the model from the observed values, each then squared, summed over all observations, divided by an appropriate, model-based, degrees of freedom value. That looks, at least to me, a lot like a variance due to things not in the model like unmodeled fixed or random effects. If the model was as simple as a mean, it would be the variance, wouldn't it? Taking the square root gives a standard deviation, or in the case of a general linear model, the RMSE.
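The calculation described above could be sketched roughly as follows, reusing the program posted earlier in the thread. The data set name `pred` is hypothetical, and the divisor is an assumption: the post calls for an "appropriate, model-based" degrees of freedom without naming one, so this sketch simply divides by the number of non-missing residuals. Substitute a different divisor if you have a defensible one.

```sas
/* Refit the posted model and save observed-minus-predicted residuals. */
proc mixed data=Result;
  class subject period treatment time;
  model result = baseline period subject treatment time treatment*time
        / outp=pred;   /* OUTP= (alias OUTPRED=) writes Pred and Resid; "pred" is a hypothetical name */
  repeated time / type=cs subject=subject*period;
run;

/* Square the residuals, sum them, divide, and take the square root. */
/* Assumption: the divisor is n_obs, not a model-based df.           */
proc sql;
  select count(Resid)                             as n_obs,
         sum(Resid*Resid)                         as sse,
         sqrt(calculated sse / calculated n_obs)  as rmse_like
  from pred
  where Resid is not missing;
quit;
```

Because this model has a REPEATED statement but no RANDOM statement, the saved predictions are most likely just the fixed-effects fit, so the result describes the spread of the data around that fit. As noted earlier in the thread, it is a summary statistic of the residuals rather than a model-based variance estimate.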

 

Since the predicted values are empirical BLUPs, this calculated value is a measure of how closely the empirical BLUPs represent the variability in the raw data.  I really need my copy of Graybill's Theory and Application of the Linear Model to refresh my BLUP knowledge, but it is in a box in the basement somewhere...

 

SteveDenham

 

 
