superbibi
Obsidian | Level 7

Hi Friends,

 

I am doing multiple imputation for a repeated-measures randomised trial, and I plan to use proc mixed for the analysis. My question is how to assess the model fit, in particular how to apply the Shapiro-Wilk test for residual normality.

 

What is the process? Should I run proc mixed and then apply the normality test to the combined imputed data sets?

 

Thank you.

 

Rick_SAS
SAS Super FREQ

If you are running a procedure that supports the normality tests, you can just run the normality test for each of the imputed data sets. This will happen automatically if you are using the BY statement to analyze the imputed data.

 

However, I don't think that PROC MIXED supports an option to run a normality test. Therefore, you need to output the residuals yourself: on the MODEL statement, specify either the OUTP= option (conditional residuals) or the OUTPM= option (marginal residuals), and include the RESIDUAL option. Here is a link to the doc.

You can then run PROC UNIVARIATE (using a BY statement) on the residuals.
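The two steps above could be sketched roughly as follows. This is only an illustration under assumptions: the data set name `mi_long`, the variables `id`, `trt`, `visit`, and `y`, and the unstructured covariance are hypothetical, and the imputed data are assumed to be in long format and sorted by `_Imputation_` (the index variable that PROC MI creates).

```sas
/* Fit the same model to every imputed data set and save the residuals. */
/* mi_long, id, trt, visit and y are hypothetical names.                 */
proc mixed data=mi_long;
   by _Imputation_;                      /* one fit per imputation       */
   class id trt visit;
   model y = trt visit trt*visit
         / residual outp=resid_cond;     /* OUTP= writes conditional     */
                                         /* residuals in variable Resid  */
   repeated visit / subject=id type=un;  /* assumed covariance structure */
run;

/* Run the normality tests on the residuals, one imputation at a time. */
proc univariate data=resid_cond normal;
   by _Imputation_;
   var Resid;
   ods output TestsForNormality=norm_tests;  /* collect the test table   */
run;

/* List the Shapiro-Wilk rows side by side for all imputations. */
proc print data=norm_tests noobs;
   where Test = 'Shapiro-Wilk';
run;
```

Replacing OUTP= with OUTPM= would give the marginal residuals instead. Note that PROC UNIVARIATE reports the Shapiro-Wilk statistic only when the sample size is at most 2000.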

 

superbibi
Obsidian | Level 7

Thank you for the response. 

 

Then what if the normality test outcome is not consistent across the imputed data sets (suppose the cutoff for the Shapiro-Wilk p-value is 0.01)? Should I transform the imputed data sets before running proc mixed?

 

Or should I use the original data set (the one before imputation) to test for normality and decide whether the data should be transformed (log, rank, ...)?

 

Thank you.

Rick_SAS
SAS Super FREQ

What are you trying to do? Are you concerned about the normality of residuals because you are concerned about the assumption of the linear regression model? If so, read this article about the assumptions and misconceptions of linear regression.

 

> what if the normality test outcome is not consistent among the imputed datasets?

 

I think you should do the analysis and use the PLOTS= option to create diagnostic plots. If you have a concern about the results, write back and post the results that concern you. Only by seeing the diagnostic plots can we know where to focus attention. If there is a problem, it might be that the model is misspecified, that the data are heteroscedastic, or many other issues.
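As a rough sketch of what requesting the diagnostic plots might look like, assuming ODS Graphics is available and using hypothetical names (`mi_long`, `id`, `trt`, `visit`, `y`), one could restrict the run to a few imputations so the number of panels stays manageable:

```sas
ods graphics on;

/* Residual diagnostics for the first three imputed data sets only.   */
/* Data set and variable names are hypothetical.                      */
proc mixed data=mi_long plots=(residualpanel studentpanel);
   where _Imputation_ <= 3;              /* limit the amount of output */
   by _Imputation_;
   class id trt visit;
   model y = trt visit trt*visit / residual;
   repeated visit / subject=id type=un;  /* assumed covariance structure */
run;

ods graphics off;
```

Each panel shows the residuals against the predicted values, a histogram, and a Q-Q plot, which is usually more informative about where a model fails than a single test p-value.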

SAS_Rob
SAS Employee

What method are you using for imputation? If you are using the MCMC method, it assumes that your data come from a multivariate normal distribution, which means you would want to check the normality of the data before running PROC MI.

 

If you get convergence in the MCMC then the data that is generated also ought to be multivariate normal so, if you assume MVN at the beginning, then you can check MVN at the end by looking at the plots MI gives for assessing convergence.
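A minimal sketch of an MCMC imputation that also produces the convergence plots, assuming the data are in wide format (one row per subject) and using hypothetical names (`trial_wide`, `baseline`, `y1`-`y3`), an arbitrary seed, and an arbitrary number of imputations:

```sas
ods graphics on;

/* MCMC imputation under a multivariate normal model.                 */
/* trial_wide, baseline and y1-y3 are hypothetical names.             */
proc mi data=trial_wide out=mi_out nimpute=20 seed=12345;
   mcmc plots=(trace acf);   /* trace and autocorrelation plots        */
   var baseline y1 y2 y3;    /* variables in the imputation model      */
run;

ods graphics off;
```

A trace plot with no visible trend and autocorrelations that die out quickly are the usual informal signs that the chain has converged. The output data set `mi_out` contains the `_Imputation_` index variable and would need to be restructured to long format before it can be analysed with PROC MIXED.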

superbibi
Obsidian | Level 7
Thank you for the response. If I want to check the normality of the residuals with the Shapiro-Wilk test, should I check it before or after imputation? If after imputation, how can I do it?
SAS_Rob
SAS Employee

As far as I know, the literature does not describe any combined test for normality for data that have already been multiply imputed, so if you are going to check normality, you would need to do it before imputation. I suppose that if you are comfortable that the MCMC has converged (again assuming that is the method you are using), it would be sufficient to check the normality of the residuals from a single imputation, but that is more of an intuition than something backed by existing theory.
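For the single-imputation check, a sketch under the same kind of assumptions (hypothetical names `mi_long`, `id`, `trt`, `visit`, `y`; the choice of the first imputation is arbitrary) might be:

```sas
/* Residuals from one imputed data set only (here the first one).     */
/* Data set and variable names are hypothetical.                      */
proc mixed data=mi_long;
   where _Imputation_ = 1;
   class id trt visit;
   model y = trt visit trt*visit / residual outp=resid_one;
   repeated visit / subject=id type=un;  /* assumed covariance structure */
run;

/* Shapiro-Wilk and the other normality tests, plus a Q-Q plot. */
proc univariate data=resid_one normal;
   var Resid;
   qqplot Resid / normal(mu=est sigma=est);
run;
```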
