☑ This topic is solved.
Dennisky
Quartz | Level 8

Dear all

 

We are currently conducting a study on ophthalmic surgery. One key indicator, X, is measured using an instrument to evaluate the effectiveness of the surgery. We have 30 patients, each with one eye affected and one healthy eye.

All patients underwent the same surgery to treat the affected eye. We measured the X indicator for both eyes of each patient at 1 week, 3 weeks, and 1 month after surgery.

By analyzing repeated measurements, we compared the differences in X indicators between the affected eye and the healthy eye at the three time points, respectively.

If there is a statistically significant difference in the X indicator for the affected eye but no change in the X indicator for the healthy eye, it can be concluded that the surgery is effective in treating this disease. (That is, a repeated measurement was taken for the affected eye, and it showed significant changes. Another repeated measurement was taken for the healthy eye, but there were no changes observed)

 

Is our thinking about this analysis correct?

Thanks !

Season
Lapis Lazuli | Level 10
Is the variable X categorical or continuous?
Dennisky
Quartz | Level 8

The variable X is continuous but the data may not follow a normal distribution.

Season
Lapis Lazuli | Level 10

So do you mean that the variable X does not follow a multivariate normal distribution? If that is the case, then sorry, I don't know the way of solving your problem. I previously thought that analysis of variance for repeated measures could solve your problem, but since the variable you are interested in does not follow a multivariate normal distribution, this method is not suitable.

Dennisky
Quartz | Level 8

Thanks!     

We have been conducting this study for several years. The variable has been studied in other research before. The related data sometimes conforms to normal distribution, and sometimes does not, which may be related to sampling representativeness and sample size.

However, in this study, we found that the variable did not conform to normal distribution (in fact, the data at the first two time points did not follow normal distribution, while the data at the last time point did). Nonetheless, although it does not follow normal distribution, we might still analyze it using methods such as generalized linear mixed effect models or GEE and so on.

 

What we are unsure of is whether the inference of these analytical results is accurate or not.

Specifically, we conducted a repeated measure analysis on the data from healthy eyes and diseased eyes separately. There was no difference in variable X among healthy eyes, while variable X in diseased eyes showed a difference. Based on this, we concluded that our surgery is effective.

Season
Lapis Lazuli | Level 10

There are a few issues I would like to point out:

(1) Normality testing.

1) Difference between tests for univariate and multivariate normality. You have stated that you tested normality for variable X at each time point. Note that a test for multivariate normality is somewhat different from a test for univariate normality. More specifically, variable X following a normal distribution at each time point does not mean that variable X follows a multivariate normal distribution across time points. A macro is now available for testing multivariate normality. However, I am not confident that your variable X follows a multivariate normal distribution, because X at some time points does not follow a univariate normal distribution.

2) Reliability of normality tests in large samples. Because the power of a test grows with sample size, tests for univariate normality usually give significant results (i.e., they reject normality) in large samples, even for minor departures. In my data analysis experience, once the sample size exceeds 100, univariate normality tests tend to produce significant results as soon as the sample distribution is slightly skewed. In that case, the test results are not that reliable. You can also use the shape of histograms, P-P plots and Q-Q plots to aid your judgement. If your sample size is large and the normality tests contradict the histograms, P-P plots and Q-Q plots, then trust the plots.
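
The per-time-point checks described above could be sketched as follows. This is only an illustration: the dataset name `have` and the variable names `x` and `visit` are assumptions, not taken from the study, and the data are assumed to be in long format (one row per patient, eye and visit).

```sas
/* Hypothetical names: have (dataset), x (indicator), visit (time point). */
ods graphics on;

proc univariate data=have normal;        /* NORMAL requests the tests for normality */
  class visit;                           /* one analysis per time point */
  var x;
  histogram x / normal;                  /* histogram with a fitted normal curve */
  qqplot x / normal(mu=est sigma=est);   /* Q-Q plot with a reference line */
  ppplot x / normal;                     /* P-P plot against the fitted normal */
run;
```

These checks look at the raw values of X at each time point, so they address univariate normality only.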

(2) Issues on GEE. To the best of my knowledge, the independent variables of GEE should be categorical, not continuous. Since variable X is continuous, I think that GEE is not suitable for your problem.

Dennisky
Quartz | Level 8

Thank you for your detailed explanation and patient analysis. They are essential for the correct implementation of our research. We will also evaluate the normality and feasibility of the methods used in our study.

 

However, sometimes our data does follow a normal distribution, and in those cases, we will use repeated measure methods to analyze it. Actually, the question that we are most concerned about is whether the research hypothesis itself is appropriate. (It's also important to assess whether the data conforms to a normal distribution, and we agree with your point of view.)

That is, when the data follow a multivariate normal distribution, we conducted a repeated measure analysis on the data from healthy eyes and diseased eyes separately. There was no difference in variable X among healthy eyes, while variable X in diseased eyes showed a difference. Based on this, we concluded that our surgery is effective.

Season
Lapis Lazuli | Level 10

@Dennisky wrote:

That is, when the data follow a multivariate normal distribution, we conducted a repeated measure analysis on the data from healthy eyes and diseased eyes separately. There was no difference in variable X among healthy eyes, while variable X in diseased eyes showed a difference. Based on this, we concluded that our surgery is effective.


I think that the method you wish to apply can solve the problem, namely reaching the conclusion on whether the surgery is effective.

Another issue I would like to point out is about multiple comparison. In ANOVA for repeated measures, the researcher can perform pairwise comparison of the variable between any time points. I don't know the exact way of analyzing your data, so in your data analyzing process, you should be fully aware of the issue of multiple comparisons. If the data analyzing technique only yields results like "the differences of results of variable X at different time points are statistically significant", without pointing out which of them are different (especially on whether the level of variable X at the last time point is different from that at previous time point(s)), then this is not enough.
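
As a rough sketch of how pairwise comparisons between time points can be adjusted for multiplicity, PROC MIXED accepts an ADJUST= option on the LSMEANS statement. The dataset and variable names (`have`, `x`, `patient`, `visit`, `eye`) and the value `'affected'` are hypothetical, and the unstructured covariance is just one possible choice.

```sas
/* Hypothetical names; analysis restricted to one eye for illustration. */
proc mixed data=have;
  where eye = 'affected';
  class patient visit;
  model x = visit;
  repeated visit / subject=patient type=un;
  /* DIFF lists all pairwise differences between visits;
     ADJUST=TUKEY adds multiplicity-adjusted p-values. */
  lsmeans visit / diff adjust=tukey;
run;
```

Other adjustments such as ADJUST=BON can be substituted; which one is appropriate depends on the comparisons planned in advance.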

Finally, I would like to refer you to a book on analyzing data with repeated measures via SAS. You may take a look at it to find out the proper way of analyzing your data.

SteveDenham
Jade | Level 19

I just want to point out that the assumption of normality (or multivariate normality) refers to the residuals of a model. An easy example of this is to fit a model of body weight. If you look at the values, they are likely to be bi-modal, and thus not normal. But by including sex in the model, the residuals will approach normality.
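
The body weight example can be illustrated with simulated data. Everything in this sketch is made up for the demonstration (the means, standard deviation, sample size and seed are arbitrary assumptions): the pooled values come from two groups with different means, while the residuals from a model that includes sex are what the normality assumption refers to.

```sas
/* Simulated data only: two sexes with different mean body weights. */
data bw;
  call streaminit(12345);
  length sex $1;
  do id = 1 to 400;
    if id <= 200 then do; sex = 'F'; weight = rand('normal', 60, 5); end;
    else do;               sex = 'M'; weight = rand('normal', 85, 5); end;
    output;
  end;
run;

ods graphics on;

/* Histogram of the raw values, pooled over sex */
proc univariate data=bw noprint;
  histogram weight;
run;

/* Fit a model with sex and save the residuals */
proc glm data=bw;
  class sex;
  model weight = sex;
  output out=bw_res r=resid;
run;
quit;

/* Q-Q plot of the residuals rather than of the raw values */
proc univariate data=bw_res noprint;
  qqplot resid / normal(mu=est sigma=est);
run;
```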

 

So, as someone who has analyzed over five hundred different studies that have body weight as a dependent variable, and with multiple measurements over time, I can say that a repeated measures analysis of covariance, with a mixed model approach to the residuals, will certainly yield a valid analysis.

 

If your dependent variable is something else, fit the mixed model, collect the residuals and look at a QQ plot. Different distributions will give different shaped plots, but the "best" distribution will almost always yield something like a straight line.
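
A minimal sketch of that workflow, assuming a long-format dataset `have` with hypothetical variables `x`, `patient` and `visit`: the OUTP= option on the MODEL statement writes predicted values and residuals (variable `Resid`) to a dataset, which can then be passed to PROC UNIVARIATE for a Q-Q plot.

```sas
/* Hypothetical names: have, x, patient, visit. */
ods graphics on;

proc mixed data=have;
  class patient visit;
  /* OUTP= saves predictions and residuals (variable Resid) */
  model x = visit / outp=pred_out;
  repeated visit / subject=patient type=un;
run;

/* Q-Q plot of the model residuals against a normal distribution */
proc univariate data=pred_out noprint;
  qqplot resid / normal(mu=est sigma=est);
run;
```

Adding the RESIDUAL option to the MODEL statement, with ODS Graphics enabled, also requests residual diagnostic panels directly from PROC MIXED.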

 

SteveDenham

Dennisky
Quartz | Level 8

Thank you for your suggestion.

We have a deeper understanding of the normality assumption (or multivariate normality assumption) which refers to the residuals of the model. This is very important for the correct development of our research.

Could you provide an example with SAS code for what you described: “I can say that a repeated measures analysis of covariance, with a mixed model approach to the residuals, will certainly yield a valid analysis.”?

 

Thanks!

SteveDenham
Jade | Level 19

Sure. From SAS for Mixed Models, with one major change, and some minor variable and dataset name changes:

 

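proc mixed data=have;
  class drug patient hour;
  /* fixed effects: treatment, time, their interaction, and the baseline covariate */
  model response = drug hour drug*hour covar;
  /* unstructured within-patient covariance over time; R and RCORR print it */
  repeated hour / subject=patient(drug) type=un r rcorr;
  /* covariate-adjusted least squares means */
  lsmeans drug hour drug*hour;
run;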

Here covar would be the last measurement of response prior to initiating drug administration. The objective here is not to find the "best" model, which involves looking at the interaction of the covariate with drug, but rather to "artificially level the playing field", such that the lsmeans are the expected values over all subjects, if the subjects were drawn from a homogeneous population with a mean of "covar". If you don't have an appropriate covariate, then delete it from the MODEL statement.
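
One way the code above might be adapted to the eye study is sketched below. This is an assumption-laden illustration, not the analysis either poster ran: it fits both eyes in a single model (rather than analyzing each eye separately), and the names `have`, `x`, `patient`, `eye` and `visit` are hypothetical. Because both eyes belong to the same patient, patient is not nested in eye here, so a patient-level random intercept is used to link the two eyes.

```sas
/* Hypothetical names; long format with one row per patient, eye and visit. */
proc mixed data=have;
  class patient eye visit;
  model x = eye visit eye*visit / ddfm=kr;
  /* the two eyes of one patient share a random intercept */
  random intercept / subject=patient;
  /* visits within the same eye are correlated (unstructured) */
  repeated visit / subject=patient*eye type=un;
  /* SLICE=VISIT tests the eye difference at each visit */
  lsmeans eye*visit / slice=visit diff;
run;
```

In such a model the eye*visit interaction asks whether the change over time differs between the affected and the healthy eye. If a pre-surgery measurement is available, it could enter the MODEL statement as the covariate, as in the original example.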

 

SteveDenham

 

Dennisky
Quartz | Level 8
Thank you so much for the detailed code.
Season
Lapis Lazuli | Level 10

Thank you, Steve, for your valuable information! Yes, it is the residuals rather than the dependent variables that should follow a (multivariate) normal distribution. I was taught in class to examine the dependent variables themselves, which is incorrect.

SteveDenham
Jade | Level 19

Not necessarily incorrect. For instance, a Poisson distributed variable might be detected by looking at the mean and variance of the variable. If the two are equal, or close to equal, a Poisson might be a good starting point. Then you could examine the residuals from a model with a log link by various plots, such as QQ to see if you get a straight or almost straight line.
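
A rough sketch of that two-step check, assuming a hypothetical dataset `counts` with a count response `y` and class variables `patient` and `visit` (none of these names come from the thread): first compare the mean and variance of the raw variable, then fit a Poisson model with a log link and inspect the residual panel, which includes a Q-Q plot.

```sas
/* Hypothetical names: counts (dataset), y (count response), patient, visit. */

/* Step 1: are the mean and variance of the raw variable roughly equal? */
proc means data=counts mean var;
  var y;
run;

/* Step 2: Poisson mixed model with a log link, with residual plots */
ods graphics on;

proc glimmix data=counts plots=residualpanel;
  class patient visit;
  model y = visit / dist=poisson link=log;
  random intercept / subject=patient;
run;
```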

 

SteveDenham

Season
Lapis Lazuli | Level 10

I am afraid that you have misunderstood me. What I called incorrect was my teachers telling us to examine the normality of the dependent variable instead of the residuals of the model.

I know this is a trivial issue, but I was concerned that our conversation could serve as a reference for other web users who may not be that familiar with the theory of linear and mixed models and might get confused.
