jkj4
Calcite | Level 5

Hello,

 

I am analyzing some data using PROC GLIMMIX. My model was working well, as evidenced by a Gener. Chi-Square / DF ratio of less than 1, for a series of analyses evaluating treatment, day, year, and their interactions. However, when I try to be specific about treatment types, the model does not work well. With that brief overview aside, here is what I am working with.

 

I have a 4x2x2 factorial arrangement with 4 levels of grasses, 2 levels of brassicas, and 2 levels of legumes, for a total of 16 treatments. We measured yield at 45, 70, and 90 days after planting in two years in a repeated measures fashion (or so I believe). There are 3 replications per treatment*day*year. I want to determine whether grass types differ, whether brassica types differ, etc., as well as whether yield differs by day and year and whether there are interactions. I have been running treatment, day, year, and their interactions as fixed effects, with random effects of rep and rep within year and day. The model in SAS that fits poorly is this:

 

class trt year day rep grass_type brass_type leg_type ;
model grass = grass_type day year grass_type*day grass_type*year grass_type*day*year/ddfm=kenwardroger;
random rep*day*year/subject=rep type=ar(1);

 

Please help with this. Also, if you think that my fixed/random effects need to be specified differently, I would appreciate any additional help.

ballardw
Super User

Which variable(s) contains your "4 levels of grasses, 2 levels of brassicas, and 2 levels of legumes "?

If that is in a single variable, then a format could be used to reduce the 8 individual levels to 3. Since you don't provide any example data, here's a skeleton of what such a format could look like:

proc format library=work;
   value $planttype
      'Grassname 1', 'Grassname 2', 'Grassname3', 'Grassname4' = 'Grass'
      'Brassicaname 1', 'Brassicaname 2' = 'Brassica'
      'Legumename 1', 'Legumename 2' = 'Legume'
   ;
run;

 

If your variable is actually numeric, modify the format code to 1) remove the $ from the format name and 2) replace the name strings with the appropriate numeric code values.
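For illustration, a numeric version of the skeleton might look like the sketch below. The code values 1 through 8 and the variable name plant_code are hypothetical placeholders, since no example data were posted; substitute whatever codes the data actually use.

```sas
/* Numeric format: no $ in the format name, numeric values on the left. */
/* The codes 1-8 are hypothetical; replace them with the real codes.    */
proc format library=work;
   value planttype
      1, 2, 3, 4 = 'Grass'
      5, 6       = 'Brassica'
      7, 8       = 'Legume'
   ;
run;
```

It would then be applied in the analysis step with a statement such as `format plant_code planttype.;`, where plant_code stands in for the actual numeric variable.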

Then in your analysis code use the format (guessing that the variable grass_type has character names)

class trt year day rep grass_type brass_type leg_type ;
model grass = grass_type day year grass_type*day grass_type*year grass_type*day*year/ddfm=kenwardroger;
random rep*day*year/subject=rep type=ar(1);
format grass_type $planttype.;
jkj4
Calcite | Level 5
For each data line I have
Trt day year rep grass_type $ brass_type $ legume_type $ grass_yield brass_yield....

Under grass_type I have 4 levels
Under brass_type I have 2 levels
Under leg_type I have 2 levels

I want to know if grass yield is different based on the 4 grass types. Does this help at all?

sld
Rhodochrosite | Level 12

I doubt that your RANDOM statement is correct, but I would need more detail about the experimental design to make any recommendations, something like a Methods section that describes the experimental protocol.

 

jkj4
Calcite | Level 5
Method:

Sixteen treatments, each a mixture of three plant species, were planted in 3 replication strips in a 4x2x2 factorial with 4 different grass species, 2 different brassica species, and 2 different legume species. At 45 days after planting, clippings were taken to determine total yield, and each plant type component was separated to provide individual plant species yields (for example, a total yield of 700 lb/ac with 450 lb/ac coming from grass, 225 lb/ac from brassicas, and 25 lb/ac from legumes). Clippings were taken again at 70 and 90 days. This was repeated the next year.
sld
Rhodochrosite | Level 12

I'm assuming that you have 16 plots within each strip, yes? Are strips essentially blocks?

 

Did you use the same strips in the second year, with the same treatment assignments to the same plots?

 

jkj4
Calcite | Level 5

Yes, there are 16 strips, and each is "essentially a block".  The location in the field was different each year.

sld
Rhodochrosite | Level 12

To clarify, you have three replicates ("strips") in one year and three different replicates in the second year, yes? If so, then year is a fixed effects factor that is assigned (although probably not randomly assigned) to replicates; consequently, year is not a repeated measures factor.

 

I'll assume that each replicate contains 16 plots, and that the 16 combinations of 4 grass species, 2 brassica species, and 2 legume species are randomly assigned to plots. I'll also assume that you measured yield on each plot three times (days 45, 70, and 90).

 

Because each level of day is measured on the same plot, day is a repeated measures factor. It is important to note that the levels of day (45, 70, 90) are not evenly spaced; this is a consideration for the choice of covariance structure type, most of which are not appropriate for repeated measures factors with unequally spaced levels. Perhaps in your experimental system, there is not much actual difference between 25 days (70 - 45) and 20 days (90 - 70); that's something you would have to decide.
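As a rough sketch of how this could look in GLIMMIX, and only under the assumptions stated above (replicate strips that differ between years, 16 plots per strip, each plot measured on all three days), the spatial power structure uses the actual day values as distances and so does not require equal spacing. The dataset name yield_data and the variable daynum are hypothetical; grass_yield is taken from the variable list posted earlier. This is an illustration, not a recommendation for the final model.

```sas
/* Hypothetical names: yield_data (input dataset), daynum (numeric copy of day). */
data yield_model;
   set yield_data;
   daynum = day;   /* assumes day is stored as numeric 45, 70, 90 */
run;

proc glimmix data=yield_model;
   class trt year rep day grass_type;
   /* full factorial of grass type, day, and year */
   model grass_yield = grass_type|day|year / ddfm=kenwardroger;
   /* replicate strips (blocks), which are different in each year */
   random intercept / subject=rep*year;
   /* repeated measures on each plot; SP(POW) uses daynum as the distance */
   random _residual_ / subject=trt*rep*year type=sp(pow)(daynum);
run;
```

Here trt*rep*year is assumed to identify a single plot. With only three measurement days, an unstructured covariance (TYPE=UN) would be another option that makes no assumption about spacing.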

 

Do you want to assess whether the brassica species or the legume species potentially affect the grass yield, in addition to a possible effect of grass species? If so, then brassica species and legume species would be included in the MODEL statement, giving you a five-factor treatment structure. A full five-way factorial (with one 5-way interaction, five 4-way interactions, ten 3-way interactions, ten 2-way interactions, and five main effects) can be difficult to interpret, and you will want to ponder how your research objectives relate to the complexity of the model.
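If the full five-way factorial turns out to be more than the research objectives need, one option is to cap the interaction order with the bar operator and @. The sketch below keeps the five main effects and the ten 2-way interactions only. The dataset name yield_data is hypothetical, and the two RANDOM statements (blocks and plots, implying compound symmetry across days) are a simplifying assumption rather than a recommended covariance structure.

```sas
/* Hypothetical dataset name: yield_data. */
proc glimmix data=yield_data;
   class trt year rep day grass_type brass_type leg_type;
   /* A|B|C|D|E@2 expands to all main effects and 2-way interactions only */
   model grass_yield = grass_type|brass_type|leg_type|day|year@2
         / ddfm=kenwardroger;
   random intercept / subject=rep*year;       /* replicate strips within year */
   random intercept / subject=trt*rep*year;   /* plots, measured on each day  */
run;
```

Changing @2 to @3 would add the ten 3-way interactions; dropping the @ suffix gives the full factorial with all 31 terms.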

 

An additional complication is that you have three responses: grass yield, brassica yield, and legume yield. These 3 variables are yield components measured on the same plot and will be correlated. You could start by analyzing each yield component separately, but at some point you will probably need to address the interdependence of these 3 yield metrics.

 

Are you using MIXED or GLIMMIX? With either, an excellent resource for building hierarchical models and implementing repeated measures is https://www.sas.com/store/books/categories/usage-and-reference/sas-for-mixed-models-second-edition/p... . The update to this text (SAS for Mixed Models: An Introduction by W. W. Stroup et al.) is supposed to be available this fall; I expect it will provide more details about the use of GLIMMIX than does the 2nd ed (which is largely focused on MIXED).

 
