SAS-questioner
Obsidian | Level 7

I tried to run a repeated-measures analysis using PROC MIXED with the data below:

ID    sex  time   outcome
1      F     1       30
1      F     2       23
2      M     1       23
2      M     2       22
3      M     1       12
3      M     2       34

The groups are unbalanced, and each person was measured twice, at two different time points. I could use a paired t-test, but I also need to compare by sex, so I used PROC MIXED to fit the model:

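proc mixed data=have;
/* CLASS must name the actual variables: the data set has time, not times. */
/* id is listed too so that subject=id does not depend on the sort order. */
class id time sex;
model outcome=sex|time/ solution CL residual outp=predresid;
/* Unstructured covariance for the two time points within each subject. */
repeated time/subject=id type=un;
run;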

proc univariate normal plot data=predresid;
var resid;
run;

However, the residuals were not normal after fitting the model. What test should I use in this situation? I looked online, and someone suggested Friedman's test, but the example code seems to use 'ID' as the block and looks roughly like this:

PROC FREQ DATA=have;
TABLES id*time*outcome / CMH2 SCORES=RANK NOPRINT;
run;

But I still need to test sex. Can I use something like id*time*sex*outcome, or is there something else I can use? Thank you!

6 REPLIES 6
SAS-questioner
Obsidian | Level 7
Thank you for the reply! My data are not counts. Maybe I can try a gamma distribution, but will the results be interpreted the same way as with a normal distribution?
Ksharp
Super User
"the residual was not normal after fitting the model."
Why do you expect the residuals to follow a normal distribution after fitting the model?
I think that if the model fits properly, the residuals should look random (or uniformly distributed), since the effects have been absorbed by the model.
SteveDenham
Jade | Level 19

Two-part answer here. First, a reply to @Ksharp : After fitting a model, the residuals may or may not be normal (Gaussian). For example, if you fit binomial data without accounting for the distribution with a link function, the residuals will not look Gaussian (though it might take a lot of data to see this). Second, a reply to @SAS-questioner : If you only have 6 data points, why are you bothering to fit a model? The mixed model or GEE model parameters will have such large standard errors that you probably won't be able to draw correct inferences from them.

 

SteveDenham

SAS-questioner
Obsidian | Level 7

Thank you for the reply. My data set has more than 6 data points; I just wanted to show its format. Also, the outcome itself is not normal at all, and when I checked the distribution of the residuals, they were not normal either. If I want to use a non-parametric method, I don't think it can test sex at the same time, right?

SteveDenham
Jade | Level 19

Question (actually a trick question) - how do you know that the distribution of the residuals is not normal? Did you do some sort of test? There are well-known issues with almost every hypothesis test for normality (overpowered with N greater than about 40, underpowered for N less than about 15), and the linear mixed model is remarkably robust to the assumption of normality of the residuals, so long as the empirical distribution is mono-modal, not truncated, and lacks extremely large absolute values. The mono-modal requirement basically boils down to sex differences: if the sexes differ a lot, the pooled distribution can show two modes.
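Since formal normality tests are unreliable at many sample sizes, a visual check is often more informative. The following is a minimal sketch, assuming the predresid data set was created by the OUTP= option in the PROC MIXED step from the original question (so it contains the variable Resid):

```sas
/* Sketch only: assumes predresid comes from OUTP= in the PROC MIXED step. */
proc univariate data=predresid;
   var resid;
   /* Histogram with a fitted normal curve: look for a single mode,
      no truncation, and no extreme values. */
   histogram resid / normal;
   /* Q-Q plot against a normal with estimated mean and SD:
      points close to the reference line suggest approximate normality. */
   qqplot resid / normal(mu=est sigma=est);
run;
```

Moderate curvature in the tails of the Q-Q plot is usually less of a concern than multiple modes or a few very large residuals.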

 

So here are some ways to attack the issue, from simple to complex:

  1. Bin your responses into four or five categories and consider using Cochran-Mantel-Haenszel methods where you stratify by sex.
  2. Plot your data and see what the shape looks like. From that, use a generalized linear model, assuming the distribution you have a picture of. If you have what might be considered random effects, use a generalized linear mixed model.
  3. Bootstrap your data. Simulate a lot of datasets that could possibly occur based on your current data.
  4. Use a Bayesian analysis with noninformative priors. This does a lot better job of simulating the data needed to construct credible intervals as you can include correlations over time or clusters. I don't think you have any random effects, so a good start on this can be found by looking through the documentation for PROC BGLIMM.
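As an illustration of option 2, here is a hedged sketch of a generalized linear mixed model in PROC GLIMMIX. The gamma distribution and log link are assumptions taken from the gamma idea raised earlier in the thread (they require a strictly positive outcome), and the subject-level random intercept is one assumed way to account for the two measurements per person:

```sas
/* Sketch only: dist=gamma and link=log are assumptions, not a recommendation
   for this particular data set; plot the outcome first. */
proc glimmix data=have;
   class id sex time;
   model outcome = sex|time / dist=gamma link=log solution;
   /* A random intercept per subject induces correlation between
      the two time points measured on the same person. */
   random intercept / subject=id;
   /* ILINK reports the means back on the original outcome scale. */
   lsmeans sex*time / ilink cl;
run;
```

With a log link, the fixed-effect estimates are on the log scale, so exponentiating a difference gives a ratio of means rather than a difference of means. That is the main change in interpretation compared with the normal-theory model.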

Given what you have done so far, I would recommend #4. You can use most of your PROC MIXED code, and you can examine each distribution/link to see which best fits your data.
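A minimal sketch of what that could look like, reusing the structure of the PROC MIXED code from the question. The seed and the numbers of burn-in and sampling iterations are arbitrary values chosen for illustration, and the sketch relies on the procedure's default priors; check the PROC BGLIMM documentation for which distributions can be combined with a REPEATED statement before swapping DIST=:

```sas
/* Sketch only: seed, nbi, and nmc are illustrative values. */
proc bglimm data=have seed=20240 nbi=2000 nmc=10000;
   class id sex time;
   /* Same fixed effects as the PROC MIXED model. */
   model outcome = sex|time / dist=normal;
   /* Unstructured covariance across the two time points within a subject. */
   repeated time / subject=id type=un;
run;
```

The output reports posterior summaries and intervals for each parameter instead of p-values, so the effect of sex is judged by whether its credible interval excludes zero.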

 

SteveDenham
