Dennisky
Quartz | Level 8

Dear all, 

We want to conduct a two-way ANOVA on our data.

However, we found that the dependent variable is not normally distributed.

As is well known, the Kruskal-Wallis test can be used in a one-way design when the dependent variable is not normally distributed.

So, is there a non-parametric equivalent of a two-way ANOVA?

Moreover, how should we calculate the sample size in this situation (a two-way ANOVA design in which the dependent variable is not normally distributed)?

Thanks a lot!

1 ACCEPTED SOLUTION

SteveDenham
Jade | Level 19

Friedman's test occurred to me too, @PaigeMiller, but two of its assumptions are that there is no interaction between the two factors and that the levels of both factors are at least uncorrelated. However, I think this is a repeated measures design (whether with 3 time points or 2). It also looks like the range of values is bounded above by 100, which suggests that the response variable might be some sort of percentage-like value. If that is the case, I would try the Stroup and Claassen approach for a GLMM:

 

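proc glimmix data=yourdata_IN_LONG_FORMAT;
   /* rescale the 0-100 score to a 0-1 proportion */
   resp = response/100;
   class group time pid;
   /* group, time and their interaction; the default link for dist=binom is the logit */
   model resp = group time group*time / dist=binom;
   /* R-side covariance among a patient's repeated measurements (Cholesky-parameterized unstructured) */
   random time / residual subject=pid type=chol;
   /* least squares means and pairwise differences; ilink also reports the means on the proportion scale */
   lsmeans group|time / diff ilink;
run;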

By long format I mean that for each combination of group, time and pid there is a single record in the dataset. To compare the "percentages" on the original scale, you'll probably need to invoke the %NLmeans macro (which in turn calls the %NLest macro). See the relevant posts by @StatDave, which point to the note where the most recent versions of these macros can be downloaded and which contain a lot of really helpful tips.
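
As a rough illustration of the long format described above, here is a minimal sketch that reshapes a wide data set (one row per patient) into one record per group, time and pid. The data set name wide and the variables score_wk3, score_wk6 and score_wk12 are assumptions made for this example; they do not come from the original post.

```sas
/* Hypothetical input: data set WIDE with one row per patient and the
   variables pid, group, score_wk3, score_wk6, score_wk12.
   All of these names are assumptions for illustration only. */
data long;
  set wide;
  array s{3} score_wk3 score_wk6 score_wk12;   /* scores at the three visits */
  array wk{3} _temporary_ (3 6 12);            /* visit times in weeks */
  do i = 1 to 3;
    time = wk{i};
    response = s{i};
    output;                                    /* one record per pid and time */
  end;
  keep pid group time response;
run;
```

The resulting data set has the variables group, time, pid and response that the PROC GLIMMIX step expects.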

 

SteveDenham

 

9 REPLIES
ballardw
Super User

You might describe the data and provide the questions to be answered by the analysis. Specifically asking for "ANOVA" may be restricting your options.

Providing some example data in the form of a data step is the best way to describe the data.

 

When you say "we find the dependent variable is non-normally distributed", that indicates to me that you have already collected the data. So why the question about sample size?

PaigeMiller
Diamond | Level 26

"However, we find the dependent variable is non-normally distributed"

 

ANOVA has no requirement that the dependent variable be normally distributed. The requirement is that the errors are normally distributed. So based on what you have said, non-parametric tests may not be necessary.
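
One way to check this assumption is to fit the model, save the residuals, and examine those rather than the raw response. The following is only a sketch: the data set long and the variables group, time and response are assumed names, and the plain PROC GLM fit ignores any correlation between repeated measurements on the same subject.

```sas
/* Fit the two-way model and save residuals and predicted values.
   Data set and variable names are hypothetical. */
proc glm data=long plots=diagnostics;
  class group time;
  model response = group time group*time;
  output out=resids r=resid p=pred;
run;
quit;

/* Normality tests and a Q-Q plot for the residuals, not the raw response */
proc univariate data=resids normal;
  var resid;
  qqplot resid / normal(mu=est sigma=est);
run;
```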

--
Paige Miller
Rick_SAS
SAS Super FREQ

Paige is correct about the assumptions of an ANOVA. However, I suspect that you are asking about PROC NPAR1WAY, which performs ANOVA-like tests for the distributions of two (or more) groups. Look at this example, which uses a nonparametric analysis to analyze the response in two treatment groups. The null hypothesis is that there is no difference in the response between the groups; the alternative hypothesis is that the response differs between the groups.
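
The linked example is not reproduced here, so the following is only a minimal sketch of that kind of analysis, not the example itself. The data set long and the variables group, time and response are assumed names, and restricting the comparison to a single time point is an assumption made to keep the observations independent.

```sas
/* Rank-based comparison of the response between groups.
   With two groups, WILCOXON requests the Wilcoxon rank-sum test;
   the Kruskal-Wallis test is reported as well. */
proc npar1way data=long wilcoxon;
  where time = 3;          /* hypothetical: analyze one time point */
  class group;
  var response;
run;
```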

 

 

Dennisky
Quartz | Level 8

Thank you! Very helpful indeed.
We have explained our study in our response. The design of our study might not be consistent with the example in the link.

Could we still use a two-way ANOVA?

Moreover, how should we calculate the sample size?

Dennisky
Quartz | Level 8

Thank you for your valuable suggestion.

(I)   We want to compare the treatment effect between a new surgical treatment group and a traditional surgical treatment group for cardiovascular disease. The treatment effect is measured by an activity score assessed in each subject at 3 weeks, 6 weeks and 12 weeks after surgery (see Table 1).

Table 1


We plan to use a two-way ANOVA to analyze the data, but we found that the activity score is not normally distributed.

So, we want to know whether there is a non-parametric equivalent of a two-way ANOVA.

 

(II)  Moreover, we could perhaps use a generalized linear mixed model (GLMM) to analyze the data (there are three time points).

 

(III) But what if we have three treatment groups and only two time points (see Table 2; Group: treatment groups A, B and C; Time: 3 weeks and 6 weeks), and the dependent variable or the errors are also not normally distributed? How should we analyze the data?

Table 2

 

(IV)  In addition, we want to calculate the sample size because this study is exploratory and we consider it a pilot experiment. We can then calculate the sample size for a future study.
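
For the sample size question, one possible starting point is PROC GLMPOWER, which takes an exemplary data set of conjectured cell means. This is only a sketch under strong assumptions: every mean and the standard deviation below are placeholders, not values from this study, and the calculation treats each group-by-time cell as independent subjects in an ordinary two-way ANOVA, so it does not account for repeated measurements on the same patient or for a non-normal response.

```sas
/* Exemplary data: conjectured cell means (placeholders, NOT study results) */
data exemplary;
  input group $ time mean_score;
  datalines;
new   3 60
new   6 70
new  12 80
trad  3 55
trad  6 60
trad 12 65
;
run;

/* Solve for the total sample size that gives 80% power for each effect */
proc glmpower data=exemplary;
  class group time;
  model mean_score = group time group*time;
  power
    stddev = 15      /* assumed error standard deviation */
    alpha  = 0.05
    ntotal = .       /* the missing value marks the quantity to solve for */
    power  = 0.80;
run;
```

Estimates from the pilot data could replace the placeholder means and standard deviation; for a design that departs this far from the ANOVA assumptions, a simulation-based calculation may be more appropriate.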

 

Thanks!

 

Dennisky
Quartz | Level 8

Thanks a lot for this link.
Notably, we have five patients in each of the new surgery and traditional surgery groups.
In the example titled 'Randomized Complete Block Design', could the "Block" be considered our three time points (3, 6 and 12 weeks) and "Trtment" our two treatment groups (new surgery and traditional surgery)?
In addition, could we analyze our study with the Scheirer–Ray–Hare technique?

(see McDonald JH. Handbook of Biological Statistics. Sparky House Publishing, Baltimore, Maryland, 2008: 173, 188.)
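
For reference, the Scheirer–Ray–Hare technique can be sketched as a two-way ANOVA on the ranks, with each effect's sum of squares divided by the total mean square and compared to a chi-square distribution. The sketch below assumes a data set long with variables group, time and response (all hypothetical names) and a balanced design. It also treats all observations as independent, which does not hold when the same patients are measured at each time point.

```sas
/* Step 1: rank the response across all observations (ties get mean ranks) */
proc rank data=long out=ranked ties=mean;
  var response;
  ranks r_response;
run;

/* Step 2: ordinary two-way ANOVA on the ranks; capture the sums of squares */
ods output OverallANOVA=overall ModelANOVA=effects;
proc glm data=ranked;
  class group time;
  model r_response = group time group*time / ss3;
run;
quit;

/* Step 3: MS_total = SS(corrected total) / df(corrected total) */
data _null_;
  set overall;
  if Source = 'Corrected Total' then call symputx('ms_total', SS / DF);
run;

/* Step 4: H = SS_effect / MS_total, referred to chi-square with the effect's df */
data srh;
  set effects;
  H = SS / &ms_total;
  p_value = 1 - probchi(H, DF);
  keep Source DF SS H p_value;
run;

proc print data=srh noobs;
run;
```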


Dennisky
Quartz | Level 8

Thank you very much!

We will conduct the analysis as you suggested.
