fernandeze11
Calcite | Level 5

Hi all,

I am a current graduate student finishing up my master's degree, and I've run into a wall with my data analysis. The design of one of my experiments is basically hierarchical, where there are beakers nested in tanks, which are then nested in treatment. In the "Fit model" window of JMP 9, the model is entered as:

Treatment

Tank #[Treatment]& Random

Beaker[Tank #]& Random

This model then provides an output, which is causing me some confusion.

- How is the program calculating the DFDen value in the fixed effect test? What numbers is it using?

- Where can I find the p-values for the random effects (beaker and tank) as well as for the fixed effect?

On a different note, is there a way to get the script for the model in SAS form so I can see exactly what is going on when the program is running the model?

Thanks!

PGStats
Opal | Level 21

If Beaker is nested within Tank # which is nested within Treatment then your third effect should be Beaker[Tank #, Treatment]& Random.

PG
fernandeze11
Calcite | Level 5

My data also failed the tests for normality, even after transformation (I should have mentioned that first). As a result, I have to resort to a nested Kruskal-Wallis test. Is there a way to do this in JMP, or do I have to write SAS code?

PGStats
Opal | Level 21

How did you do that test (for normality)?

PG
fernandeze11
Calcite | Level 5

I tested for normality both visually and quantitatively. I ran my data through the "Distribution" platform in JMP and looked at the normal quantile plots and the histogram with a normal curve plotted over it. I tested the goodness of fit using the Shapiro-Wilk test.

The transformations I tried were cube root, square root, and log, since the data were right-skewed. Those didn't help.

PGStats
Opal | Level 21

Don't forget what the model is: y = f(x) + e, where e is normally distributed with mean zero. It is e, as estimated by the residuals of your model, that must be tested for normality, not y. Another important fact to remember is the robustness of ANOVA to non-normality. As it turns out, testing for normality is almost always useless. When you have very few observations, the test lacks power. When you have plenty of data, the smallest non-normality is detected, but the central limit theorem kicks in and the ANOVA is very robust, even to large departures from normality. My preferred strategy is: inspect the distribution of the residuals, and worry about transformations, going non-parametric, or changing the model only if that distribution is severely skewed or polymodal.
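To illustrate the point about testing e rather than y, here is a minimal Python sketch using only the standard library. The data, the treatment names, and the `skewness` helper are all hypothetical, and f(x) is simplified to a plain treatment mean (the tank and beaker effects of the nested design are left out). It only shows that a pooled response can look right-skewed while the residuals are perfectly symmetric.

```python
from statistics import mean

def skewness(values):
    """Moment-based sample skewness g1 = m3 / m2**1.5 (hypothetical helper)."""
    m = mean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Made-up data: the same symmetric noise around three treatment means,
# one of which is much larger than the other two.
noise = [-2.0, -1.0, 0.0, 1.0, 2.0]
treatment_means = {"control": 10.0, "low": 11.0, "high": 30.0}
data = {name: [mu + e for e in noise] for name, mu in treatment_means.items()}

# Raw response pooled over treatments (what a histogram of y shows).
pooled_y = [y for ys in data.values() for y in ys]

# Residuals: each observation minus its own treatment mean.
residuals = [y - mean(ys) for ys in data.values() for y in ys]

skew_y = skewness(pooled_y)        # clearly positive: y looks right-skewed
skew_resid = skewness(residuals)   # zero: the errors are symmetric

print(f"skewness of y:         {skew_y:.3f}")
print(f"skewness of residuals: {skew_resid:.3f}")
```

In this toy case the skew in y comes entirely from the treatment effect, so a normality test on y would flag a problem that the model itself already explains.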

PG
SteveDenham
Jade | Level 19

Excellent answer. A polymodal distribution of the residuals almost surely indicates that an important factor is missing from the model. The other distribution problem that I get worried about is a severely platykurtic distribution, which isn't as amenable to transformation as even a severe skew.

Besides normality testing being nearly useless, another preliminary to ANOVA that I see done over and over is testing for homogeneity of variance. I offer this quote from George Box:

"To make a preliminary test on variances is rather like putting to sea in a row boat to find out whether conditions are sufficiently calm for an ocean liner to leave port!" - [Box, "Non-normality and tests on variances", 1953, Biometrika 40, pp. 318-335]

Almost all of the tests for homogeneity that I know about, including that in PROC GLIMMIX based on change in the log likelihood or the various HOVTEST options in PROC GLM, are sensitive to the assumption of normality for the residuals, so the whole vicious circle becomes a problem.

Some approaches I like: never trust p-values completely, always treat variances as heterogeneous, and learn how to do permutation tests. Someday, I'll follow my own advice.
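As a rough idea of what a permutation test could look like for a design like the one in this thread, here is a minimal Python sketch (standard library only). Everything in it is an assumption made for illustration: the numbers are made up, there are only two treatments, and `exact_permutation_p` is a hypothetical helper. Because tanks are nested in treatment, the sketch averages beakers within each tank and reshuffles whole tanks between treatments, rather than reshuffling individual beakers.

```python
from itertools import combinations
from statistics import mean

def exact_permutation_p(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in means.

    Enumerates every way of splitting the pooled values into groups of the
    original sizes and counts the splits whose absolute mean difference is
    at least as large as the observed one.
    """
    pooled = group_a + group_b
    n_a, n_b = len(group_a), len(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    total = sum(pooled)
    extreme = 0
    n_splits = 0
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        n_splits += 1
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
    return extreme / n_splits

# Made-up data: each inner list holds the beakers of one tank.
control_tanks = [[4.0, 4.2], [3.7, 3.9], [4.5, 4.3], [4.1, 3.9]]
treated_tanks = [[5.8, 6.0], [6.2, 6.4], [5.5, 5.7], [6.0, 6.2]]

# Collapse beakers to one mean per tank, so tanks are the units permuted.
control_means = [mean(tank) for tank in control_tanks]
treated_means = [mean(tank) for tank in treated_tanks]

p_value = exact_permutation_p(control_means, treated_means)
print(f"exact permutation p-value: {p_value:.4f}")  # 2 of 70 splits -> 0.0286
```

With four tanks per treatment there are only 70 possible splits, so the test can be exact. With more tanks, a random sample of reshuffles would be the usual substitute for full enumeration.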

Steve Denham
