Checking ANOVA assumptions visually using residual plots

ANOVA assumes that residuals (errors) are normally distributed and terms have equal variance (homoscedasticity, antonym heteroscedasticity). Professional statisticians frequently check ANOVA assumptions visually.

We use a dataset that formed the basis of a paper describing the response of *Calluna* (heath) plants to nitrogen and drought. Nitrogen, plant source (heathland), and drought were applied in a 2 × 2 × 2 factorial design. Researchers randomized plants in a greenhouse, with 10 plant pots per treatment unit (n=10), tested over two years.

This dataset holds some interesting clues about nitrogen and drought effects on heath plants. But before relying too much on the output, we should test the assumptions. How is that done visually?

```
/* Turn on ODS Graphics so that procedures can display plots */
ods graphics on;

/* PLOTS(ONLY)= requests just two displays: the studentized residual panel
   and treatment-specific boxplots of the residuals. The CONDITIONAL option
   bases the residuals on the full model specification, i.e. the fixed
   effects of the factorial design plus the estimated random effect.
   The name literal 'dry weight above (g)'n requires VALIDVARNAME=ANY. */
proc mixed data=Heath.data plots(only)=(studentpanel(conditional) boxplot(conditional));
  class Year Heathland Nitrogen Drought Replicate;
  model 'dry weight above (g)'n = Drought Nitrogen Drought*Nitrogen
        Heathland Heathland*Drought Heathland*Nitrogen Heathland*Drought*Nitrogen;
  random Year;
run;
```

The histogram of studentized residuals in the panel is clearly bimodal, so the residuals do not follow a normal distribution.

Figure: Bimodal distribution of variance
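
To look at the same residuals outside the panel, they can be written to a data set and plotted directly. The following is a minimal sketch rather than part of the original analysis: the output data set name `work.resid` is hypothetical, and the studentized residuals are expected to appear in a variable named `StudentResid` when the RESIDUAL option is combined with OUTP=.

```sas
/* Sketch only: work.resid is a hypothetical output data set name */
ods graphics on;

proc mixed data=Heath.data;
  class Year Heathland Nitrogen Drought;
  /* OUTP= saves conditional predicted values and residuals;
     RESIDUAL adds studentized and Pearson residuals to that data set */
  model 'dry weight above (g)'n = Drought Nitrogen Drought*Nitrogen
        Heathland Heathland*Drought Heathland*Nitrogen Heathland*Drought*Nitrogen
        / residual outp=work.resid;
  random Year;
run;

/* Histogram with a fitted normal curve, plus a normal Q-Q plot */
proc univariate data=work.resid;
  var StudentResid;
  histogram StudentResid / normal;
  qqplot StudentResid / normal(mu=est sigma=est);
run;
```

A bimodal histogram shows up as two humps against the fitted normal curve, and as an S-shaped departure from the reference line in the Q-Q plot.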

**By-Treatment Boxplots**

Let’s take a look at the boxplots to try to understand trends of unexplained variance.

Figure: Unequal variance among watering treatments

By far the widest range of residuals appears in the boxplot for the well-watered treatment. This appears to be the culprit for the unequal variance. The data points associated with the well-watered treatment skew both high and low. Perhaps individual plants responded to plenty of water either well or poorly. Next time, it might be useful to keep this in mind and capture watering response as an explanatory variable.
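
If the wider spread under the well-watered treatment is taken at face value, one option is to let the residual variance differ by watering treatment instead of assuming a single value. This is an alternative that the article itself does not use, sketched here under the assumption that `Drought` is the variable separating the watering treatments:

```sas
/* Sketch only: allows a separate residual variance per Drought level */
ods graphics on;

proc mixed data=Heath.data plots(only)=(studentpanel(conditional) boxplot(conditional));
  class Year Heathland Nitrogen Drought;
  model 'dry weight above (g)'n = Drought Nitrogen Drought*Nitrogen
        Heathland Heathland*Drought Heathland*Nitrogen Heathland*Drought*Nitrogen;
  random Year;
  /* REPEATED with GROUP= estimates one residual variance for each
     level of Drought rather than a single pooled variance */
  repeated / group=Drought;
run;
```

Rerunning the boxplots after this change shows whether the studentized residuals have a more even spread across the two watering treatments.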

While the watering treatment represents a departure from equal variance, it was not the cause of the non-normal distribution. We can see this by reviewing the median residuals, which are similar between the two watering treatments. The non-normality was due to another factor: notice how the medians shift in the boxplots for year and nitrogen. Digging into the data shows that the two years produced different drought and nitrogen treatment effects for above-ground dry weight. For this reason, it could be advisable to analyze each year's experiment independently.
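
A by-year analysis along those lines could be sketched as follows. This is an illustration under assumptions, not the authors' code: the sorted data set name `work.heath_sorted` is hypothetical, and `Year` is dropped from the model because it has only one level within each BY group.

```sas
/* Sketch only: work.heath_sorted is a hypothetical data set name */
proc sort data=Heath.data out=work.heath_sorted;
  by Year;
run;

ods graphics on;

proc mixed data=work.heath_sorted plots(only)=(studentpanel boxplot);
  by Year;  /* fit the factorial model separately for each year */
  class Heathland Nitrogen Drought;
  /* No RANDOM statement: within one year there is no
     between-year variance left to estimate */
  model 'dry weight above (g)'n = Drought Nitrogen Drought*Nitrogen
        Heathland Heathland*Drought Heathland*Nitrogen Heathland*Drought*Nitrogen;
run;
```

Each year then gets its own residual panel and boxplots, so the assumptions can be checked per experiment.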

Testing ANOVA assumptions need not be a checkbox exercise. The visual review of residuals allows researchers to make the most of their experiments and data models.

Comments

06-15-2020
08:47 AM

Very good article for beginners.

I think the first sentence has an omission. I think it should say "ANOVA assumes that residuals (errors) are independent and normally distributed and terms have equal variance (homoscedasticity, antonym heteroscedasticity)."

I would like to show this article to people at some point in time, but the graphics appear too small to really be useful. Can this be fixed by the author?

@ChrisHemedinger there is no author's name shown on this page, I believe that is also an omission.

06-15-2020
09:20 AM

**John Gottula**, a SAS employee, focuses on AgTech (a renewed focus area for SAS). I'll reach out to see if he has a better version of these graphics. Thanks for the comments!

06-15-2020
11:58 AM

The Statistical Analysis System's roots in agriculture are mostly unknown nowadays. It would be interesting to see a presentation on SAS's use in Ag now vs. then. Is there a completely different set of users, perhaps different crops or different farm sizes?

Graphics are much better now, and there's much more variety and power in modeling procedures, but I think box plots have been around for a long time.

06-26-2020
10:40 AM

@ChrisHemedinger your reply does not address my concern, or perhaps I didn't state it clearly enough.

There should be a by-line underneath the article title near the top of the page for these posts in the SAS Communities Library. The by-line can use the author's SAS Communities id, in this case jozgot, but it should be up there.

06-26-2020
01:06 PM

06-26-2020
01:13 PM

Yes, that would be useful. Although I don't see why you couldn't list all contributors in a by-line.

It is unnatural (and did not occur to me) to scroll down and look in the right-side column to find the name of the author. In almost every other type of publication (newspaper, magazine, blog, internet forum) the author's name is immediately under or immediately next to the title, or even in the case of the rest of SAS Communities, the author's name is directly above the article's title.

06-26-2020
04:47 PM
