# PROC MIXED and non normal data

Hello everybody,

I have non-normal data to analyse with a repeated measures model. I plan to use PROC MIXED on rank-transformed data. Is this a sound approach?

## Re: PROC MIXED and non normal data

I have done this. Generate the ranks separately by time point. But if you are analyzing the ranks using PROC MIXED, beware of unbalanced data. If you have 20 observations at the first time point and only 10 at some later time point, there is no way the mean ranks at each time could be equal: the mean of the ranks 1 to n is (n + 1)/2, so it would be 10.5 at the first time point and 5.5 at the later one. Look up Friedman's Test on Google as an example of repeated measures on ranks.
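To make the unbalanced-data warning concrete, here is a minimal sketch in Python rather than SAS. The data values and the helper `midranks` are made up for illustration; the only point is that, when ranks are computed within a time point, the mean rank is fixed by the number of observations at that time point.

```python
from statistics import mean

def midranks(values):
    """Rank values from 1..n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

# Hypothetical responses: 20 subjects at time 1, only 10 left at time 2.
time1 = [(7 * i) % 23 for i in range(20)]
time2 = [(3 * i) % 11 for i in range(10)]

mean_rank_t1 = mean(midranks(time1))
mean_rank_t2 = mean(midranks(time2))
print(mean_rank_t1, mean_rank_t2)  # 10.5 5.5
```

Whatever the responses are, the two mean ranks differ purely because of the sample sizes, so a time effect on the ranks would partly reflect dropout rather than the response.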

Remember, the normality assumption in PROC MIXED applies to the residuals, not to the data itself. For example, if you plotted the data and saw a bimodal distribution, you might conclude that the distribution is non-normal. Drill a little deeper, and it might be that you have an unaccounted-for covariate (say gender) that leads to two peaks. If you can identify the process that generates the errors/residuals, you could specify the distribution and use PROC GLIMMIX. This is especially the case for errors from various exponential family distributions.
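A small illustration of the residuals-versus-data point, sketched in Python with entirely made-up numbers (the group labels, the means of 50 and 70, and the deviations are all assumptions). The raw values form two separate clusters, yet once the group mean is removed the residuals are a single symmetric set centred on zero.

```python
from collections import Counter
from statistics import mean

# Hypothetical within-group deviations, identical in both groups.
deviations = [-2, -1, -1, 0, 0, 0, 1, 1, 2]

# Two groups whose means differ by 20 units: the pooled data are bimodal.
data = [("F", 50 + d) for d in deviations] + [("M", 70 + d) for d in deviations]

# Fit the simplest possible model: one mean per group.
group_means = {g: mean(y for grp, y in data if grp == g) for g in ("F", "M")}
residuals = [y - group_means[g] for g, y in data]

raw_counts = Counter(y for _, y in data)   # two clusters, around 50 and 70
residual_counts = Counter(residuals)       # one peak at 0
print(sorted(raw_counts.items()))
print(sorted(residual_counts.items()))
```

This is only a toy case with a single categorical covariate; in practice the residuals would be checked from the fitted mixed model itself.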

Steve Denham

## Re: PROC MIXED and non normal data

There are many issues to consider when analyzing factorials (including repeated measures) using ranks. By definition, the ranks will have unequal variances and an unstructured covariance matrix. Full details, including SAS code, can be found in the book:

Brunner, E., Domhof, S., Langer, F. 2002. Nonparametric Analysis of Longitudinal Data in Factorial Experiments. Wiley Publ.

Without taking some precautions in the analysis, you will get grossly inflated type I error rates.

## Re: PROC MIXED and non normal data

I definitely agree with this, and it is easy to see if you transform all observations to ranks. However, with equal numbers of observations at each time point, and ranks computed separately at each time point, at least some of the concerns regarding unequal variances are addressed (i.e., for 20 observations per time point, the variance of the numbers 1 to 20 is asymptotically constant under a null of no group effect). I'm not so sure about the covariances, though, and this gives me something to think about regarding some of the analyses we have been doing. Generally, autoregressive models have yielded lower information criterion values than the unstructured models, given the by-timepoint ranking.
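The arithmetic behind the equal-n argument can be checked with a short Python sketch (the function name `rank_variance` is just for illustration). It computes the population variance of the untied ranks 1 to n and compares it with the closed form (n² − 1)/12. It covers only the overall variance of the ranks at one time point; it says nothing about group-specific variances or about the covariances between time points.

```python
def rank_variance(n):
    """Population variance of the untied ranks 1..n."""
    ranks = range(1, n + 1)
    centre = (n + 1) / 2
    return sum((r - centre) ** 2 for r in ranks) / n

for n in (20, 10):
    # Direct computation next to the closed form (n^2 - 1) / 12.
    print(n, rank_variance(n), (n * n - 1) / 12)
# 20 33.25 33.25
# 10 8.25 8.25
```

With 20 observations at every time point the variance of the by-timepoint ranks is 33.25 throughout; if a later time point has only 10 observations, it drops to 8.25, which is another face of the unbalanced-data problem.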

Anyway, I am willing to wager that the OP's question could be better answered by using GLIMMIX with a proper selection of error distribution. I have never been a big fan of nonparametrics, going back to a FORTRAN program I wrote as an undergraduate to do Spearman correlation.

Steve Denham

## Re: PROC MIXED and non normal data

Regards,

## Re: PROC MIXED and non normal data

I agree with Steve's idea to use GLIMMIX or possibly NLMIXED, after checking the residuals from the models fitted by MIXED. GLIMMIX offers many distributions, and with NLMIXED, well, there's almost nothing you can't do. There was an interesting paper at the most recent NESUG on fitting W-shaped distributions.
