One way ANOVA assumptions

laurenhosking · Posted 01-04-2020 03:57 PM

I’ve done a one-way ANOVA test. How would I check whether it satisfies all of the theoretical
assumptions listed below?

The key assumptions are:
• The samples must be independent.
• The populations or groups from which the samples were obtained must be normally distributed and all populations must have the same variance.
• In practice, this last assumption can also be checked by testing the normality of the residuals and the homogeneity of the variances in the different groups.

unison · Posted 01-04-2020 08:43 PM

No picture included.

-unison

laurenhosking · Posted 01-05-2020 07:05 AM

Thank you, I edited it.

Ksharp · Posted 01-05-2020 05:37 AM

x ~ iid N(mu, sigma)

proc glm plots=all;

to get all the pictures.

Calling @Rick_SAS

PaigeMiller · Posted 01-05-2020 07:15 AM

@laurenhosking wrote:

I’ve done a one-way ANOVA test. How would I check whether it satisfies all of the theoretical
assumptions listed below?

The key assumptions are:
• The samples must be independent.

Generally, you have to know (or assume) in advance that the samples are independent. I don't think it's something that people normally test by analyzing the data. If you are studying randomly selected individuals, that's usually enough to assume they are independent. If the individuals are not randomly selected, for example if they all come from a single family or are otherwise blood relatives, you might conclude in advance that there could be some dependence between them.

• The populations or groups from which the samples were obtained must be normally distributed and all populations must have the same variance.

As you have stated the requirement, this is not true. The errors from the fitted model must be normally distributed, not the raw data itself. This can be checked by examining the residuals. To test whether the groups have the same variance, use the HOVTEST option of the MEANS statement in PROC GLM.
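A minimal sketch of the HOVTEST option, assuming a one-way layout. SASHELP.CARS, Origin, and MPG_City are stand-ins for your own data set, grouping variable, and response; they are not from the original question.

```sas
/* One-way ANOVA with a homogeneity-of-variance test.      */
/* SASHELP.CARS, Origin, and MPG_City are illustrative     */
/* stand-ins for your own data, group, and response.       */
proc glm data=sashelp.cars;
   class Origin;
   model MPG_City = Origin;
   /* HOVTEST=LEVENE requests Levene's test of equal       */
   /* group variances; WELCH adds Welch's ANOVA, which     */
   /* does not assume equal variances.                     */
   means Origin / hovtest=levene welch;
run;
quit;
```

A small p-value for Levene's test is evidence against equal variances. In that case the Welch result is usually the one to read.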

--
Paige Miller

laurenhosking · Posted 01-05-2020 07:19 AM

Thank you. Do you think I can do a goodness-of-fit test, as before, to check the second assumption?

PaigeMiller · Posted 01-05-2020 07:21 AM

The second assumption is called a "compound statement" because there are two parts, and it's not clear which of the two parts you are talking about.

The populations or groups from which the samples were obtained must be normally distributed and all populations must have the same variance.

Are you talking about the normal distribution part, or are you talking about the same variance part?

--
Paige Miller

laurenhosking · Posted 01-05-2020 07:23 AM

I believe the normally distributed part.

PaigeMiller · Posted 01-05-2020 07:45 AM

The diagnostic plots you get from PROC GLM (if that's what you are using) include a histogram of the residuals and a Q-Q plot, both of which can be used to check the normality of the residuals.
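For reference, a short sketch of how that diagnostics panel might be requested. SASHELP.CARS, Origin, and MPG_City are again illustrative stand-ins, not the poster's data.

```sas
/* Request the fit diagnostics panel for a one-way ANOVA.  */
/* The panel includes a residual histogram and a Q-Q plot  */
/* of the residuals.                                       */
ods graphics on;

proc glm data=sashelp.cars plots=diagnostics;
   class Origin;              /* stand-in grouping variable */
   model MPG_City = Origin;   /* stand-in response          */
run;
quit;
```

Points on the Q-Q plot that fall close to the reference line suggest the normality assumption is reasonable. Strong curvature or heavy tails suggest it is not.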

--
Paige Miller

Rick_SAS · Posted 01-05-2020 02:43 PM

I think PaigeMiller has answered your questions. I will merely add that a one-way ANOVA is equivalent to a linear regression model with a single categorical regressor. As such, you might want to review the article "On the assumptions (and misconceptions) of linear regression." That article was written for a continuous regressor, but most of the ideas are the same regardless of whether the regressor is discrete or continuous. In particular, the regression diagnostic plots (discussed in the last section of the article) can provide graphical evidence that can help you decide whether the assumptions are reasonable for your data.
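The equivalence can be seen by fitting the same model both ways. This is only a sketch under assumptions: SASHELP.CARS stands in for your data, and the dummy variables asia and europe and the data set cars_dummy are hypothetical names introduced for the illustration.

```sas
/* ANOVA form: Origin enters as a CLASS variable.          */
proc glm data=sashelp.cars;
   class Origin;
   model MPG_City = Origin / solution;
run;
quit;

/* Regression form: the same model written with 0/1 dummy  */
/* variables (USA is the reference level). The names       */
/* cars_dummy, asia, and europe are illustrative.          */
data cars_dummy;
   set sashelp.cars;
   asia   = (Origin = 'Asia');
   europe = (Origin = 'Europe');
run;

proc reg data=cars_dummy;
   model MPG_City = asia europe;
run;
quit;
```

The overall F statistic and the residuals are the same in both fits, which is why the regression diagnostics carry over to the ANOVA.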

PaigeMiller · Posted 01-05-2020 04:10 PM

@Rick_SAS wrote:

I think PaigeMiller has answered your questions. I will merely add that a one-way ANOVA is equivalent to a linear regression model with a single categorical regressor. As such, you might want to review the article "On the assumptions (and misconceptions) of linear regression." That article was written for a continuous regressor, but most of the ideas are the same regardless of whether the regressor is discrete or continuous. In particular, the regression diagnostic plots (discussed in the last section of the article) can provide graphical evidence that can help you decide whether the assumptions are reasonable for your data.

I'm going to add a few more comments.

If you just want to do an ANOVA, the only parts of the ANOVA that depend on normality of the errors are the F-tests performed to see whether the model terms are significantly different from zero; the rest of the computations do not depend on normality. And even if the errors have some non-normal distribution, the central limit theorem sometimes comes into play if you have enough data: the means estimated by the ANOVA are approximately normally distributed anyway, so the F-tests are approximately correct.

As far as independence and correlated errors go (as mentioned by @Rick_SAS), the test he links to is for one type of correlation, specifically auto-correlation, or in other words correlation over time. There are types of correlation between the subjects in the study that are not correlation over time, but which have to be assumed and which I don't think you can (easily) analyze for. The one I mentioned is a biological study in which subjects are related to one another rather than randomly selected. Going back to the original statement in this thread, "The samples must be independent": there is no general test for lack of independence, although there is a test for auto-correlation.
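One common auto-correlation check is the Durbin-Watson statistic. It may or may not be the exact test in the linked article, so treat this as a sketch under assumptions. It only makes sense when the observations have a meaningful time order. The data set trial and its variables t, group, y, grpB, and grpC are simulated and hypothetical.

```sas
/* Simulated, hypothetical data: 60 observations collected */
/* in time order t, three groups A, B, and C.              */
data trial;
   call streaminit(1);
   do t = 1 to 60;
      group = substr('ABC', mod(t, 3) + 1, 1);
      grpB  = (group = 'B');     /* 0/1 dummy for group B   */
      grpC  = (group = 'C');     /* 0/1 dummy for group C   */
      y     = 10 + 2*grpB + rand('normal');
      output;
   end;
run;

/* The data must be sorted in time order before the test.  */
proc sort data=trial;
   by t;
run;

/* DW prints the Durbin-Watson statistic; DWPROB adds      */
/* p-values for positive and negative autocorrelation.     */
proc reg data=trial;
   model y = grpB grpC / dw dwprob;
run;
quit;
```

A Durbin-Watson statistic near 2 is consistent with no first-order auto-correlation. It says nothing about other kinds of dependence, such as related subjects.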

All of this may be too much for the purposes of answering the original question.

--
Paige Miller

laurenhosking · Posted 01-08-2020 07:05 AM

So I did a goodness-of-fit test and, as mentioned, some of my samples aren't normally distributed and some are. Would I then use the F-test to see whether in general they are normally distributed? Or is there another way?

@Rick_SAS

PaigeMiller · Posted 01-08-2020 08:01 AM

@laurenhosking wrote:

So I did a goodness-of-fit test and, as mentioned, some of my samples aren't normally distributed and some are. Would I then use the F-test to see whether in general they are normally distributed? Or is there another way?

@Rick_SAS

The raw data does not have to be normally distributed. The errors from the fitted model have to be normally distributed. You test this by examining the histograms of the residuals and the Q-Q plot of the residuals. F-tests do not test to see if the data is normally distributed.
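To make the distinction concrete, here is a sketch that saves the residuals from the fitted model and examines those, rather than the raw data in each group. SASHELP.CARS, Origin, and MPG_City are stand-ins, and the names anova_resid and resid are hypothetical. The NORMAL option adds formal tests (such as Shapiro-Wilk) that go beyond the plots mentioned above.

```sas
/* Fit the one-way ANOVA and save the residuals.           */
proc glm data=sashelp.cars noprint;
   class Origin;
   model MPG_City = Origin;
   output out=anova_resid r=resid;   /* resid = residual   */
run;
quit;

/* Examine the residuals, not the raw response.            */
proc univariate data=anova_resid normal;
   var resid;
   histogram resid / normal;                  /* with fitted normal curve */
   qqplot resid / normal(mu=est sigma=est);   /* with reference line      */
run;
```

With large samples the formal tests can reject normality for departures too small to matter, so the plots are usually the better guide.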

--
Paige Miller

laurenhosking · Posted 01-08-2020 08:07 AM

Thank you so much, I just figured out what I did wrong!
