Evaluating ANOVA assumptions using SAS

10-21-2016 06:58 AM

Hi all,

I'm constructing a baseline table in which I want to see whether certain baseline characteristics (e.g. age) differ across categories (n=4) of a certain exposure, using ANOVA. I'm using quartiles of the exposure as the categories (independent variable), so my design is balanced.

Now, I first want to check my assumptions. Please see the syntax I used:

```
PROC UNIVARIATE DATA=my.data NORMAL PLOT;
VAR X Y Z;
QQplot X Y Z;
BY quartiles_exposure;
RUN;
```

Since I have a large dataset (n>4000), I look at both the QQ plots (and histograms) and the Kolmogorov-Smirnov test.

However, even for the variables that look normally distributed visually, the p-value of the KS test is consistently reported as <0.0100, indicating a departure from normality. How is this possible?

Furthermore: how am I supposed to test for equal variances? Am I maybe using the wrong procedure?

Anyone willing to help me out: thanks a lot!

Best regards,

Marjolein

Accepted Solutions

Solution

11-04-2016 06:20 AM


11-02-2016 01:43 PM

Nearly every test for normality will flag a distribution as "not normal" once the sample size is large enough, because with thousands of observations even trivial departures from normality become statistically significant. As a result, the QQ plot is a far better guide to whether the assumption is met. Also, remember that the normality assumption in ANOVA applies to the **residuals**, not to the variables themselves, so make sure the input to PROC UNIVARIATE is the residuals from your ANOVA. Finally, recall that ANOVA is robust to modest violations of its assumptions, especially with large samples, so minor deviations from normality or homoskedasticity will not greatly influence the outcome.

Steve Denham
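
To make the point about residuals concrete, here is a minimal sketch of one way to do it, reusing the dataset and variable names from the question (`my.data`, `X`, `quartiles_exposure`). The output dataset `work.resid_x` and the variable name `resid_x` are made up for illustration, and the same steps would need to be repeated for `Y` and `Z`.

```
/* Fit the one-way ANOVA and save the residuals.                 */
/* work.resid_x and resid_x are illustrative names only.         */
proc glm data=my.data;
  class quartiles_exposure;
  model X = quartiles_exposure;
  output out=work.resid_x r=resid_x;   /* r= stores the residuals */
run;
quit;

/* Inspect the residuals rather than the raw variable.           */
proc univariate data=work.resid_x normal;
  var resid_x;
  qqplot resid_x / normal(mu=est sigma=est);  /* reference line from estimated mean and SD */
  histogram resid_x / normal;                 /* overlay a fitted normal curve */
run;
```

Because the residuals already have the group means removed, no BY statement is needed here; a single QQ plot covers all four quartiles.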

All Replies


10-21-2016 11:35 PM

Yes. Use HOVTEST.

```
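/* One-way ANOVA with a homogeneity-of-variance test */
proc glm data=sashelp.class;
  class sex;
  model weight = sex;
  means sex / hovtest;   /* Levene's test is the default */
run;
quit;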
```
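
Applied to the data in the original question, the same idea might look like the sketch below. It assumes the names from the question (`my.data`, `X`, `quartiles_exposure`); the choice of Levene's test and the addition of the `WELCH` option are illustrative, not something the thread prescribes.

```
/* Sketch: equal-variance check for X across exposure quartiles. */
proc glm data=my.data;
  class quartiles_exposure;
  model X = quartiles_exposure;
  /* hovtest=levene requests Levene's test explicitly;           */
  /* welch adds Welch's variance-weighted one-way ANOVA, which   */
  /* does not assume equal variances.                            */
  means quartiles_exposure / hovtest=levene welch;
run;
quit;
```

With n>4000, a homogeneity-of-variance test can be as sensitive to small differences as the normality tests are, so it is worth comparing the group standard deviations directly alongside the p-value.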
