Re: Split-plot design at random multiple locations

Posted 01-03-2019 04:42 PM
(2138 views)

Dear all,

I have a general question about using PROC MIXED for a split-plot design at multiple random locations. Suppose A is the main-plot factor, B is the sub-plot factor, Rep is the replication, and Loc is the location. If Loc is fixed, I know the code should be

proc mixed; class Loc Rep A B; model y =Loc|A|B; random Rep(Loc) A*Rep(Loc); run;

However, if Loc is a random effect, how should I revise the code? Should I also include Loc*A, Loc*B, and Loc*A*B in the RANDOM statement? What I think is

proc mixed; class Loc Rep A B; model y = A B A*B; random Loc Rep(Loc) A*Rep(Loc);

Please correct me, thanks!!

Furthermore, I know that mathematically Loc + Rep(Loc) is equivalent to Loc + Rep + Loc*Rep in SAS, but which one is the correct way to understand the logic of the experimental design? Suppose Loc is crossed with Rep; should A*Rep(Loc) then be changed to A*Rep + A*Loc + A*Rep*Loc? Are they equivalent?

Thanks for any input!!

I am assuming here that *Loc*s are random, that *Rep*s are random and nested within *Loc*s, that WholePlots are random and nested within *Rep*s, and that SubPlots are random and nested within WholePlots. The experimental unit for fixed effects factor *A* is WholePlot, and the experimental unit for fixed effects factor *B* is SubPlot.

If *Loc* is random, then generally you would think of the spatial inference space as being defined by *Loc*s. Consequently, *Rep*s within *Loc*s are subsamples. I would consider something like:

```
proc mixed;
class loc rep a b;
model y = a b a*b;
random loc
a*loc
b*loc
a*b*loc;
random rep(loc)
a*rep(loc)
b*rep(loc);
```

This leaves a*b*rep(loc) to be residual variance.
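
Written out, the model fitted by the code above can be sketched as follows. The subscripts and symbols are notation assumed here for illustration, not taken from the thread: i indexes levels of A, j indexes levels of B, k indexes Loc, and l indexes Rep within Loc.

```latex
y_{ijkl} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij}
         + L_k + (\alpha L)_{ik} + (\beta L)_{jk} + (\alpha\beta L)_{ijk}
         + R_{l(k)} + (\alpha R)_{il(k)} + (\beta R)_{jl(k)}
         + e_{ijkl}
```

The first line holds the fixed effects from the MODEL statement, the second line the terms of the first RANDOM statement, and the third line the terms of the second RANDOM statement. The remaining term e_{ijkl} plays the role of a*b*rep(loc), the residual.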

You could combine *b\*loc + a\*b\*loc* by replacing the first RANDOM statement with

```
random loc
a*loc
a*b*loc;
```

If the second RANDOM statement generated estimation problems, you could omit it; the residual variance would then pool *rep(loc) + a\*rep(loc) + b\*rep(loc) + a\*b\*rep(loc)*.
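
A minimal sketch of that fallback, assuming the pooled form of the first RANDOM statement is kept. The dataset name `trial` is hypothetical.

```sas
/* Fallback sketch: second RANDOM statement omitted.              */
/* "trial" is a hypothetical dataset name, not from the thread.   */
proc mixed data=trial;
  class loc rep a b;
  model y = a b a*b;
  /* Location-level variance components only */
  random loc a*loc a*b*loc;
  /* All rep(loc)-level variation is now absorbed into the residual */
run;
```

With this version the whole-plot error term a*rep(loc) is no longer separated from the residual, so it seems best treated as a last resort when the fuller model will not converge.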

If the experimental design is like I described above, then I think *Rep(Loc)* or *Rep\*Loc* makes more sense than *Rep + Loc\*Rep*. If each *Rep* is not uniquely identified across all *Loc*s, then you must specify either *Rep(Loc)* or *Rep\*Loc*. I prefer *Rep(Loc)* because it explicitly implies nesting of *Rep* units within *Loc* units.
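
To illustrate the point about unique identification, here is a sketch of the alternative: building a rep identifier that is unique across locations, after which the nesting notation is no longer needed. The dataset name `trial` and the new variable `rep_id` are hypothetical, and the sketch assumes Rep is coded 1, 2, ... within each Loc.

```sas
/* Hypothetical names: trial (input data), rep_id (new variable). */
data trial_uid;
  set trial;
  length rep_id $ 40;
  /* Combine loc and rep, e.g. loc 2 and rep 1 become "2_1" */
  rep_id = catx('_', loc, rep);
run;

proc mixed data=trial_uid;
  class loc rep_id a b;
  model y = a b a*b;
  random loc a*loc a*b*loc;
  /* rep_id groups observations the same way rep(loc) does */
  random rep_id a*rep_id;
run;
```

If Rep is left coded 1, 2, ... within each Loc and entered as a plain `rep` effect, rep 1 at one location would be treated as the same unit as rep 1 at every other location, which is why the nested or crossed form is required in that case.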

I hope this helps.

Hi sld,

Thanks very much for your well-organized reply! However, based on the first part of your code, I couldn't see the difference between A and B; it looks like a strip-plot design. I also hesitate to agree with the view that *Reps* within *Locs* are subsamples, because different reps have different experimental units. Within each location, Reps are independent of each other, just as subplots within each main plot are independent. Otherwise, it makes no sense to test the A*B interaction.

Thank you for the explanation about Rep(Loc); I agree with you that Rep(Loc) makes more sense. I would also like to know how to modify the RANDOM statement when using Rep*Loc.

I'm not sure what you mean by "I couldn't see the difference between A and B". Could it be something to do with including *b\*loc* in the first RANDOM statement? I usually use the second version that pools *b\*loc* with *a\*b\*loc*.

Regarding whether Reps are subsamples within Loc or not: see Section 6.6 (A Multilocation Example) in Littell et al., SAS for Mixed Models, 2nd ed. This example lays out the choices that must be made, with some suggested guidelines. Their general recommendation is that the location x treatment term should be retained in the model if you cannot comfortably assume that treatment effects are the same at all locations. At the time of publication (2006), they note that some assumptions depend on the specifics of the study and "are considered controversial by many statisticians". So there is possibly room for different approaches, depending on study context.

I don't know whether this information is in the newly-released 3rd ed: SAS® for Mixed Models: Introduction and Basic Applications.

Syntax-wise, Rep*Loc expands like Rep(Loc): each covers Loc, Rep, and Rep*Loc, not including components that are otherwise specified in the model. So the two forms are interchangeable in my experience. This documentation link refers to the design matrix for nested effects in the MODEL statement, but construction of the design matrix for the RANDOM statement is largely identical.
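
As a sketch of what that interchangeability looks like in practice, the two calls below differ only in how the Rep terms are written. The dataset name `trial` is hypothetical, and the claim that both fits give the same covariance parameter estimates rests on the equivalence described above, so it is worth confirming on your own data.

```sas
/* "trial" is a hypothetical dataset name. */

/* Version 1: nested notation */
proc mixed data=trial;
  class loc rep a b;
  model y = a b a*b;
  random loc a*loc a*b*loc;
  random rep(loc) a*rep(loc);
run;

/* Version 2: crossed notation; rep does not appear on its own, */
/* so rep*loc indexes the same units as rep(loc)                */
proc mixed data=trial;
  class loc rep a b;
  model y = a b a*b;
  random loc a*loc a*b*loc;
  random rep*loc a*rep*loc;
run;
```

Comparing the "Covariance Parameter Estimates" tables from the two runs is a quick way to check that the two notations were treated identically.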

Thank you, sld, for the useful references. They are much appreciated, and I will read them carefully.

Sorry for the confusion caused by my words "I couldn't see the difference between A and B". What I mean is that the main-plot factor A and the sub-plot factor B are parallel, or symmetric, in your code: I couldn't tell which one is the main plot and which one is the sub-plot from the two RANDOM statements. I think the second RANDOM statement should be

random rep(loc) a*rep(loc);

Suppose it is a split-plot RCB. Then b*rep(loc) should not be there, because a split-plot design assumes Reps are independent of the sub-plot. If my understanding is wrong, please correct me. Thank you!

I agree that the random statements could be

```
random loc
a*loc
a*b*loc;
random rep(loc)
a*rep(loc);
```

and are generally/probably what I would use unless, as you note, I have a strip-plot element in the design or perhaps a lack of random assignment of treatment to experimental units. Still, most of the designs I work with are small samples, and estimating fewer variance components is nearly always the better route!
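
Placed in a complete call, those RANDOM statements might look like the sketch below. The dataset name `trial` is hypothetical, and the `ddfm=kr` option (Kenward-Roger degrees of freedom) is an addition not discussed in this thread; it is included only as a commonly used choice for small split-plot experiments and can be dropped.

```sas
/* Sketch of the agreed split-plot model with random locations.    */
/* Assumptions: dataset "trial" is hypothetical; ddfm=kr is an     */
/* optional choice that does not come from the thread.             */
proc mixed data=trial;
  class loc rep a b;
  /* Fixed effects: main-plot factor A, sub-plot factor B, A*B */
  model y = a b a*b / ddfm=kr;
  /* Location-level variance components (b*loc pooled into a*b*loc) */
  random loc a*loc a*b*loc;
  /* Block and whole-plot error within locations */
  random rep(loc) a*rep(loc);
run;
```

Here a*rep(loc) is the whole-plot error and the residual serves as the sub-plot error, which is what distinguishes A from B in the random part of the model.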

I work in agricultural science, and my samples are usually small too. I couldn't agree more that "estimating fewer variance components is nearly always the better route!" I usually don't include a*b*loc. Thank you for all of your help!

You're welcome.

This is a nice paper, too, if you haven't run across it yet: "On recognizing the proper experimental unit in animal studies in the dairy sciences".
