Statistical Procedures

muhlig · Posted 11-25-2022 11:27 AM

Dear Community,

Do you know of any option to do a non-parametric kind of two-way ANOVA/mixed model analysis with the option for multiple comparisons and an adequate adjustment?

I have data of a rather dichotomous character which can’t be transformed into a Gaussian distribution and would therefore violate the assumptions of a regular ANOVA/mixed model analysis. I measured bilirubin at 5 endpoints (28d, 31d, 35d, 42d and 56d) at two treatment levels (control, treatment). It's not a repeated-measures design; different subjects were measured at each endpoint. Attached you will find a quick plot of the data to give a better idea of how the data are structured, as well as a Q-Q plot and the studentized residuals.

Thank you very much!

best Moritz

[Attached plots: normal-quantile plot; bili vs. endpoint by group; studentized residuals]

PaigeMiller · Posted 11-25-2022 11:52 AM

@muhlig wrote:

I have data of a rather dichotomous character which can’t be transformed into a Gaussian distribution and would therefore violate the assumptions of a regular ANOVA/mixed model analysis.

I assume these dichotomous character variables are predictor variables. ANOVA does not require predictor variables to have a Gaussian distribution. No transformation is needed. Everything you are trying to do can be done in ANOVA without violating assumptions.

proc glm data=have;
    class endpoint treatment;
    model billirubin = endpoint treatment;
run; 
quit;

--
Paige Miller

muhlig · Posted 11-25-2022 01:04 PM

Dear Paige,

thank you very much for your reply!

Bilirubin was measured in blood samples at the respective endpoints and should be a continuous response variable, not a predictor variable. It just shows a rather dichotomous distribution because the values are either very high in response to the treatment or very low in the control group. And the distribution of the residuals of the response variable must be specified, mustn't it?

Maybe my explanation above was not specific enough or I misunderstood something.

I measured some other values, for instance cytokines, that showed a logN distribution as expected. Those were analyzed using PROC GLIMMIX as shown below. I am looking for an alternative for the case where the distribution is unknown.

best Moritz

proc glimmix data=my_data;
    class Group Endpoint;
    model measured_value = Group Endpoint Group*Endpoint / dist=logN ddfm=kr2;
    random _resid_ / group=Group;
    covtest homogeneity;
    output out=resids resid=r;

    lsmeans Group*Endpoint / slicediff=Endpoint
                             adjust=sim
                             stepdown(type=logical)
                             adjdfe=row lines
                             plot=meanplot(sliceby=Group join cl);
run;

proc univariate data=resids normaltest;
    var r;
    qqplot;
run;

PaigeMiller · Posted 11-25-2022 02:08 PM

Response variables do not have to have a Gaussian distribution either. The errors have to be Gaussian. You will notice that in the code I provided above, billirubin was the response variable.

See https://blogs.sas.com/content/iml/2018/08/27/on-the-assumptions-and-misconceptions-of-linear-regress...

The fact that the values of Y vary greatly between the treatment and control groups can be accounted for by including treatment in the model.

--
Paige Miller
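
A minimal sketch of how the errors, rather than the raw response, could be checked for normality. It reuses the dataset and variable names from the PROC GLM code earlier in the thread (have, endpoint, treatment, billirubin); the output dataset name glm_resids and the variable names resid and stud_resid are assumptions made for illustration.

```sas
/* Fit the same main-effects model as in the earlier reply and
   save the raw and studentized residuals.
   glm_resids, resid and stud_resid are hypothetical names. */
proc glm data=have plots=none;
    class endpoint treatment;
    model billirubin = endpoint treatment;
    output out=glm_resids r=resid student=stud_resid;
run;
quit;

/* Normality tests and a Q-Q plot for the residuals,
   not for the response itself. */
proc univariate data=glm_resids normal;
    var resid;
    qqplot resid / normal(mu=est sigma=est);
run;
```

If the Q-Q plot of resid follows the reference line reasonably well, the normality assumption is plausible even when the response itself looks bimodal across groups.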

muhlig · Posted 11-25-2022 05:05 PM

Hey Paige, thanks again for helping me to sort this out.

As stated in the article, you need to check the residuals for approximate normality, and that is what I did. Please forgive my sloppy simplification in saying that my data are not Gaussian. To demonstrate the non-normality of the residuals, I provided a normal quantile plot of the residuals. To my mind it shows a systematic deviation and not random scatter. You will find the tests for normality of the residuals attached.

Of course, treatment was included in the model as the group variable (the groups are 'control' and 'treatment').

best Moritz

[Attached screenshot: Bildschirmfoto 2022-11-25 um 22.58.20.png]

PaigeMiller · Posted 11-25-2022 06:28 PM

That's good, we now see that the residuals are not normally distributed. Once again, @Rick_SAS has the explanation: https://blogs.sas.com/content/iml/2022/08/17/box-cox-regression.html

--
Paige Miller

muhlig · Posted 11-26-2022 11:18 AM

It's not that I haven't tried Box-Cox yet, but I like the concept of using Box-Cox to get an idea of the next best distribution. After using the suggested code as displayed:

proc sql noprint;                              
 select 1-min(bili) into :c trimmed from my_data;
quit;
%put &=c;
 
proc transreg data=my_data ss2 details plots=(boxcox);
   model BoxCox(bili / parameter=&c geometricmean 
                         convenient lambda=-2 to 2 by 0.05) = identity( group | Endpoint);
   output out=TransOut residual;
run;

proc univariate data=TransOut(keep=Rbili);
   histogram Rbili / normal kernel;
   qqplot Rbili / normal(mu=est sigma=est) grid;
   ods select histogram qqplot GoodnessOfFit Moments;
run;

I got the following results, which still do not look 'normal enough', do they?

Just to get this right: if the Box-Cox CONVENIENT option had suggested, for instance, lambda=0 and the residuals had been approximately normal (let's say every test with p>0.01), could I have used the parametric approach with dist=logN?

[Attached screenshot: Bildschirmfoto 2022-11-26 um 17.05.29.png]

PaigeMiller · Posted 11-26-2022 11:58 AM

It may be that there are no obvious transformations that turn your data into something that has close to normal errors.

--
Paige Miller

PaigeMiller · Posted 11-26-2022 03:27 PM

@PaigeMiller wrote:

It may be that there are no obvious transformations that turn your data into something that has close to normal errors.

Seems that a sentence of mine never made it ... I want to add that if you can determine what distribution is a better fit than the normal distribution, you can use PROC GLIMMIX (if that distribution is available in PROC GLIMMIX).

--
Paige Miller
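
A minimal sketch of what fitting a non-normal distribution in PROC GLIMMIX might look like, using the dataset and variable names from the poster's code (my_data, group, Endpoint, bili). The choice of dist=gamma is purely an assumption for illustration; nothing in the thread establishes which distribution fits the bilirubin values, and the gamma distribution requires strictly positive responses. The output dataset and variable names (gamma_out, p, pearson_r) are hypothetical.

```sas
proc glimmix data=my_data;
    class group Endpoint;
    /* dist=gamma with a log link is an illustrative assumption,
       not a recommendation from the thread. */
    model bili = group Endpoint group*Endpoint / dist=gamma link=log;
    /* Treatment vs. control within each endpoint, with
       simulation-based multiplicity adjustment; ILINK reports
       the LS-means on the data scale as well. */
    lsmeans group*Endpoint / slicediff=Endpoint adjust=simulate ilink cl;
    /* Save predicted values and Pearson residuals for model checking. */
    output out=gamma_out pred=p pearson=pearson_r;
run;
```

Plotting pearson_r against p is one way to judge whether the assumed distribution is a reasonable fit before trusting the adjusted comparisons.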

muhlig · Posted 11-26-2022 05:07 PM

So which sentence are you referring to?

And if there is no transformation that turns my data into something with close to normal errors, aren’t we back at my initial question of whether there is a non-parametric alternative?

PaigeMiller · Posted 11-26-2022 05:08 PM

I added the sentence ... in my previous reply.

So here is some code for non-parametric two-way ANOVA.

https://support.sas.com/documentation/onlinedoc/stat/ex_code/121/friedman.html

--
Paige Miller
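
A minimal sketch of the PROC FREQ idiom that the linked Friedman example relies on, adapted to the poster's variable names (my_data, Endpoint, group, bili). Treating Endpoint as the stratification (block) variable is an assumption. Because this design has several subjects per group at each endpoint, the result is a stratified rank-based test rather than the classical Friedman test with one observation per cell, and it addresses only the overall group effect, not the interaction or the per-endpoint comparisons.

```sas
/* Strata = Endpoint, rows = group, columns = bili.
   SCORES=RANK replaces the response by rank scores, and CMH2 requests the
   first two CMH statistics, including "Row Mean Scores Differ". */
proc freq data=my_data;
    tables Endpoint*group*bili / cmh2 scores=rank noprint;
run;
```

The statistic to read is "Row Mean Scores Differ", which tests whether the groups differ in their rank scores after adjusting for endpoint.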

muhlig · Posted 11-28-2022 10:04 AM

Thanks for your effort, Paige!

Ksharp · Posted 11-25-2022 11:10 PM

" a non-parametric kind of 2way Anova"
Maybe you could make a new variable like:
new=catx('|',sex,age);
And then use a one-way non-parametric test:
proc npar1way wilcoxon ;
class new;
var .....
run;
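
A minimal sketch of this suggestion applied to the original question, assuming the poster's dataset my_data with the variables group, Endpoint, and bili. The dataset name my_data_cells and the variable name cell are hypothetical. The DSCF option of PROC NPAR1WAY requests the Dwass, Steel, Critchlow-Fligner pairwise comparisons, which is one possible way to get multiplicity-adjusted comparisons.

```sas
/* Collapse the two factors into one cell label, e.g. "control|28d". */
data my_data_cells;
    set my_data;
    length cell $ 40;
    cell = catx('|', group, Endpoint);
run;

/* Kruskal-Wallis test across all cells, plus DSCF
   multiplicity-adjusted pairwise comparisons. */
proc npar1way data=my_data_cells wilcoxon dscf;
    class cell;
    var bili;
run;
```

With 2 groups and 5 endpoints this compares all 45 pairs of cells, so the adjustment is stricter than necessary if only the 5 within-endpoint comparisons of treatment versus control are of interest.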

muhlig · Posted 11-26-2022 01:56 PM

Hey ksharp,

Thanks for helping me! I had already thought about that, but hoped there would be a more elegant way.

Best Moritz

non-parametrical 2 way Anova/mixed model with multiple comparisons