☑ This topic is solved.
Sofie3
Fluorite | Level 6

Hi everyone,

 

I'm looking for help deciding which statistical procedure I should use to analyse the following data:

Independent variables:

- GROUP: control & intervention

- TIME: baseline & end

The dependent variable (DV) is continuous, with a Shapiro-Wilk p-value < 0.0001.

Because of this test result, I think I should use a nonparametric alternative to a two-way ANOVA, and I am considering the Scheirer-Ray-Hare test.

However

1) Is this test possible in SAS? And if so, can someone help me with the code and interpretation?

2) Or am I interpreting the Shapiro-Wilk test incorrectly, so that I can continue with the two-way ANOVA?

3) I also want to add a covariate to my analysis. However, I don't know if this is possible with the suggested tests (ANOVA, Scheirer-Ray-Hare, or another test I haven't thought of).

 

Thanks in advance for your help.

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26

ANOVA does not require a normally distributed response variable. It requires normally distributed errors, which you can check by fitting the model and seeing if the residuals are normally distributed.

--
Paige Miller


13 REPLIES
JosvanderVelden
SAS Super FREQ
Have you seen this post https://communities.sas.com/t5/Statistical-Procedures/Is-there-a-non-parametric-equivalent-of-a-two-...? Does that seem relevant for your case?

Best regards, Jos
Sofie3
Fluorite | Level 6
Yes, I have seen it. However, Friedman's test is only possible with one predictor variable, so I tried PROC GLIMMIX instead.
However, I always receive the same error in my log: 'Invalid or missing data.'
This was the code that I used:
proc glimmix data=ild.all;
class intervention visit;
model mean_steps =intervention visit intervention*visit/dist = bin;
random intercept / subject=study_id;
run;
With 2 intervention classes, 2 visit classes, and a continuous dependent variable (mean_steps).

Is my code incorrect, or what could be the reason for this error?
I found this as well; unfortunately, it did not help solve my problem: https://communities.sas.com/t5/Statistical-Procedures/Is-there-a-non-parametric-equivalent-of-a-two-....

Kind regards
Sofie

PaigeMiller
Diamond | Level 26

ANOVA does not require a normally distributed response variable. It requires normally distributed errors, which you can check by fitting the model and seeing if the residuals are normally distributed.

--
Paige Miller
Sofie3
Fluorite | Level 6
Thank you PaigeMiller for the correction of my misconception.
I would like to fit the model, however with 2 independent categorical variables, it seems difficult to do?
If possible, I would like to hear how I can do it.
Thank you for your help.
PaigeMiller
Diamond | Level 26

@Sofie3 wrote:
Thank you PaigeMiller for the correction of my misconception.
I would like to fit the model, however with 2 independent categorical variables, it seems difficult to do?

Should not be difficult. All SAS modeling programs allow two independent categorical variables. You have already provided code that works.

--
Paige Miller
Sofie3
Fluorite | Level 6

 

@PaigeMiller  I used this code: 


ods graphics on;
proc reg data = ild.all; *p-value significant = not linear;
model mean_steps = intervention;
plot mean_steps*intervention;
plot r. *p.;
run;

 

Seeing the following output: 

Sofie3_0-1673439692652.png

To me, it is unclear whether the errors are normally distributed. R² = 0.0278; does that mean they are not?

And if so, do I then have to look for another analytical test?

PaigeMiller
Diamond | Level 26

Not the residuals from some regression. You need the residuals from your PROC GLIMMIX with two categorical variables. You plot the distribution of the residuals, not the residuals against one of the x-variables.
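
A minimal sketch of that check, assuming the data set and variable names used earlier in this thread (ild.all, mean_steps, intervention, visit, study_id). The output data set work.resid_out and the variables resid and pred are hypothetical names chosen for illustration. Leaving out the dist= option lets GLIMMIX use its default normal distribution for the continuous response:

```sas
/* Fit the two-factor model and save the residuals */
proc glimmix data=ild.all;
   class intervention visit study_id;
   model mean_steps = intervention visit intervention*visit;  /* default: dist=normal */
   random intercept / subject=study_id;
   /* work.resid_out, pred, and resid are hypothetical names */
   output out=work.resid_out pred=pred resid=resid;
run;

/* Look at the distribution of the residuals themselves */
proc univariate data=work.resid_out normal;
   var resid;
   histogram resid / normal;                    /* histogram with fitted normal curve */
   qqplot resid / normal(mu=est sigma=est);     /* Q-Q plot against a fitted normal   */
run;
```

The histogram and Q-Q plot are usually more informative than the formal normality tests, which tend to reject for large samples even when the departure is small.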

--
Paige Miller
Ksharp
Super User
Since you have a TIME variable, I think this is a repeated-measures experiment; you should use a mixed model via PROC GLIMMIX or PROC MIXED.
Sofie3
Fluorite | Level 6
I'm planning to use PROC MIXED, but I think the same normality assumption has to be fulfilled as in an ordinary ANOVA.
Ksharp
Super User
Then you could try another distribution, such as GAMMA.
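
A sketch of what that might look like, assuming mean_steps is strictly positive (a gamma model cannot handle zeros or negative values) and reusing the names from earlier in the thread. The log link is a common choice for a gamma response, not something prescribed in this thread:

```sas
/* Gamma mixed model for a positive, right-skewed response */
proc glimmix data=ild.all;
   class intervention visit study_id;
   model mean_steps = intervention visit intervention*visit
         / dist=gamma link=log;                 /* assumption: mean_steps > 0 */
   random intercept / subject=study_id;         /* random intercept per participant */
run;
```

With a log link, the fixed-effect estimates are on the log scale, so differences between groups are interpreted as ratios of means rather than differences.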
StatsMan
SAS Super FREQ

As @PaigeMiller posted (and citing Rick's excellent blog post), the normality condition is on the residuals, not the dependent variable. You can run MIXED and check the plot of your residuals to see if they appear normal "enough" to you. 
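
One way this could look, as a sketch that reuses the names from earlier in the thread. The unstructured covariance (type=un) is an assumption here; with only two visits other structures may serve equally well:

```sas
ods graphics on;

/* Repeated-measures model; the RESIDUAL option requests residual diagnostics */
proc mixed data=ild.all;
   class intervention visit study_id;
   model mean_steps = intervention visit intervention*visit / residual;
   repeated visit / subject=study_id type=un;   /* assumption: unstructured covariance */
run;
```

With ODS Graphics enabled, the RESIDUAL option produces panels that include a histogram and a Q-Q plot of the residuals.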

Sofie3
Fluorite | Level 6

Thank you @StatsMan

Seeing this output, I would say that the residuals are 'normal', right? (I looked at the upper-right graph.)

Sofie3_0-1673446644858.png

 

SteveDenham
Jade | Level 19

There is a bit of a right tail, as shown in the histogram and at the upper right part of the QQ plot, but definitely not enough to make the assumption of normality of the residuals untrue. One thing I do see is that it looks like one of your model factors has three levels, rather than the two you mentioned in the original post.

 

So here is a way to analyze this using time as a repeated factor at the residual level, using PROC GLIMMIX:

 

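proc glimmix data=ild.all;
   class intervention visit study_id;
   /* fixed effects: group, time, and their interaction (default dist=normal) */
   model mean_steps = intervention visit intervention*visit;
   /* visit as a repeated (R-side) effect, unstructured covariance within subject */
   random visit / residual type=un subject=study_id;
run;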

Adding a covariate is then easy to do by adding the variable to the MODEL statement. If the covariate is categorical, you will also need to add it to the CLASS statement.
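
As an illustration only, here is a sketch with two hypothetical covariates that do not appear in the original post: age (continuous) and sex (categorical). Substitute your own variable names:

```sas
proc glimmix data=ild.all;
   /* sex is a hypothetical categorical covariate, so it is listed in CLASS */
   class intervention visit study_id sex;
   /* age is a hypothetical continuous covariate, so it appears in MODEL only */
   model mean_steps = intervention visit intervention*visit age sex;
   random visit / residual type=un subject=study_id;
run;
```

The continuous covariate contributes a single slope, while the categorical one contributes one effect per level.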

 

SteveDenham

 
