
Analyzing experimental repeats using mixed models

Started ‎05-26-2020 by
Modified ‎05-26-2020 by

Most researchers struggle to combine dissimilar experiments.  On the other hand, combining results of similar studies for the sake of statistical analysis can be simple.  Mixed models can be a great option to combine similar data sets (experimental repeats) effectively.

 

Mixed models are ideal to combine experiments with blocking factors.  For example, our experiment could have blocks in space or in time.  Blocks are sources of known or likely variation, where that variation is not of primary interest to the study.

 

For instance, let's compare a mixed model with a traditional ANOVA for a greenhouse study. We will use a dataset generously opened to the scientific community.  Its manuscript describes a test of Calluna (heath) plants' response to nitrogen and their tolerance of drought. The authors ran the study twice over a two-year period.

 

Traditional ANOVA

Based on the experiment's design, we want to test the effects of Drought, Nitrogen, and Heathland, and how they interact with each other.  But are we interested in the year effect (year 1 versus year 2)? If we run PROC ANOVA with Year in the MODEL statement, we treat all variables as fixed effects:

proc ANOVA data=Heath.data;
/* Heath.data was previously uploaded as an Excel file and imported into a SAS-readable format. */
class Year Heathland Nitrogen Drought Replicate;
model 'dry weight above (g)'n=
Drought
'Year'n
Nitrogen
'Year'n*nitrogen
Heathland
'Year'n*Heathland
Heathland*Nitrogen
'Year'n*Heathland*Nitrogen
Drought*'Year'n
Drought*Nitrogen
Drought*'Year'n*nitrogen
Drought*Heathland
Drought*'Year'n*Heathland
Drought*Heathland*Nitrogen
Drought*'Year'n*Heathland*Nitrogen;
/* The MODEL statement specifies the [fixed] effects. Because the design is factorial, each main effect is listed along with every interaction (factors joined by asterisks, *).
A name in single quotes followed by n, such as 'Year'n or 'dry weight above (g)'n, is a SAS name literal. It tells SAS to read the quoted text as a variable name from the source data table, which is required when the name contains spaces or special characters. */
RUN;

 

The output table shows Year among several highly significant factors (p < .0001), which means the two experiments (years) gave different results.  In addition, the significant Year*Nitrogen, Year*Drought, and Year*Nitrogen*Drought interactions indicate that the effects of nitrogen, drought, and the combination of the two differed between years.

| Source | DF | Anova SS | Mean Square | F Value | Pr > F |
|---|---|---|---|---|---|
| Drought | 1 | 3.001736 | 3.001736 | 16.93 | <.0001 |
| Year | 1 | 1424.813673 | 1424.813673 | 8037.40 | <.0001 |
| Nitrogen | 1 | 144.159422 | 144.159422 | 813.21 | <.0001 |
| Year*Nitrogen | 1 | 115.528100 | 115.528100 | 651.70 | <.0001 |
| Heathland | 1 | 0.571965 | 0.571965 | 3.23 | 0.0746 |
| Year*Heathland | 1 | 0.000000 | 0.000000 | 0.00 | 1.0000 |
| Heathland*Nitrogen | 1 | 1.483411 | 1.483411 | 8.37 | 0.0044 |
| Year*Heathla*Nitroge | 1 | 0.000000 | 0.000000 | 0.00 | 1.0000 |
| Year*Drought | 1 | 1.169362 | 1.169362 | 6.60 | 0.0112 |
| Nitrogen*Drought | 1 | 12.734403 | 12.734403 | 71.84 | <.0001 |
| Year*Nitroge*Drought | 1 | 15.215862 | 15.215862 | 85.83 | <.0001 |
| Heathland*Drought | 1 | 0.363152 | 0.363152 | 2.05 | 0.1545 |
| Year*Heathla*Drought | 1 | 0.914895 | 0.914895 | 5.16 | 0.0246 |
| Heathl*Nitrog*Drough | 1 | 0.000000 | 0.000000 | 0.00 | 1.0000 |
| Year*Heat*Nitr*Droug | 1 | 0.237733 | 0.237733 | 1.34 | 0.2488 |

 

Some would advocate separating the two experiments and analyzing them independently. This rational recommendation stems from the major differences in outcomes between years (all those significant Year* interaction results). Researchers and audiences who want year-specific detail will benefit from that approach.
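As a rough sketch of that separate-experiments option, the same fixed effects could be tested within each year using BY-group processing. This assumes the data set and variable names used in the PROC ANOVA step above. The output data set name work.heath_sorted is a placeholder chosen here, and the bar operator (|) is shorthand for listing every main effect and interaction.

```sas
/* Sketch only: one ANOVA per year, using the article's data set and variable names. */
/* work.heath_sorted is a hypothetical name for the sorted copy.                    */
proc sort data=Heath.data out=work.heath_sorted;
  by Year;
run;

proc anova data=work.heath_sorted;
  by Year;                           /* run the analysis separately for each year */
  class Heathland Nitrogen Drought;
  /* A|B|C expands to all main effects plus all two- and three-way interactions */
  model 'dry weight above (g)'n = Drought|Nitrogen|Heathland;
run;
quit;
```

Each year then gets its own ANOVA table, so Year no longer appears as an effect and no conclusion is drawn across years.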

 

Mixed Model

On the other hand, what if we aren’t interested in the effect of year, and are only interested in the effects of drought, nitrogen, and heathland?  Let’s combine the experiments by treating year as a random effect in PROC MIXED:

proc mixed data=Heath.data;
/* This code follows the PROC ANOVA step, except that "ANOVA" is changed to "MIXED", Year and all of its interactions are removed from the MODEL statement, and Year alone is placed in a new RANDOM statement. */
class Year Heathland Nitrogen Drought Replicate;
model 'dry weight above (g)'n=
Drought
Nitrogen
Drought*nitrogen
Heathland
Heathland*Drought
Heathland*Nitrogen
Heathland*Drought*Nitrogen;
random 'Year'n;
/* The RANDOM statement specifies the blocking factors, in this case the year. */
RUN;
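Written out, the model requested by the MODEL and RANDOM statements above has roughly the following form. The symbols are notation introduced here for illustration and are not taken from the source manuscript: D, N, and H are the fixed Drought, Nitrogen, and Heathland effects, u is the random year effect, and the last term is the residual error.

```latex
y_{ijklm} = \mu + D_i + N_j + H_k
          + (DN)_{ij} + (HD)_{ik} + (HN)_{jk} + (HDN)_{ijk}
          + u_l + \varepsilon_{ijklm},
\qquad
u_l \sim N(0, \sigma^2_{\text{year}}), \quad
\varepsilon_{ijklm} \sim N(0, \sigma^2)
```

Here l indexes the year and m the observation within a treatment combination and year. Year contributes a variance component rather than its own tests, which is why it is absent from the table of fixed effects.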

 

The output culminates in Type 3 tests of fixed effects, which we interpret like the PROC ANOVA results.

| Effect | Num DF | Den DF | F Value | Pr > F |
|---|---|---|---|---|
| Drought | 1 | 132 | 3.41 | 0.0669 |
| Nitrogen | 1 | 132 | 127.89 | <.0001 |
| Nitrogen*Drought | 1 | 132 | 13.71 | 0.0003 |
| Heathland | 1 | 132 | 0.29 | 0.5932 |
| Heathland*Drought | 1 | 132 | 0.60 | 0.4411 |
| Heathland*Nitrogen | 1 | 132 | 0.72 | 0.3961 |
| Heathl*Nitrog*Drough | 1 | 132 | 0.00 | 0.953 |

 

 

PROC MIXED benefited us because it allowed us to generalize across the two experiments.  Even though the experiments differ substantially, we can still draw broad conclusions about the effects we care about.  This comes with risks: we know the year had an effect, and that effect may hold the key to valuable information.  Additionally, we took the naughty tack of not testing ANOVA assumptions before sallying toward interpretation.
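One way to begin checking those assumptions is to request residual diagnostics from PROC MIXED itself. The sketch below refits the same model with ODS Graphics enabled. It assumes the data set and variable names used earlier, and uses the bar operator as shorthand for the seven fixed effects listed in the MODEL statement above. It is a starting point, not a complete assumption check.

```sas
/* Sketch only: refit the mixed model and request residual diagnostics. */
ods graphics on;

proc mixed data=Heath.data plots=residualpanel;
  class Year Heathland Nitrogen Drought;
  /* RESIDUAL asks for residual diagnostics for the fitted model */
  model 'dry weight above (g)'n = Drought|Nitrogen|Heathland / residual;
  random Year;
run;

ods graphics off;
```

The resulting panel can be inspected for signs of non-normal or unevenly spread residuals before the Type 3 tests are taken at face value.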

 

We've only scratched the surface of PROC MIXED.  SAS shines impressively in the diversity and versatility of its mixed model procedures.  See how your colleagues apply SAS mixed models in their research by searching your favorite manuscript database for "SAS," "random factor" and a relevant keyword (try "maize," or even "heath").

Version history
Last update:
‎05-26-2020 05:46 PM