tess921
Calcite | Level 5

Hi SAS community,

I am a novice SAS user. I would like to analyze a dataset from an experiment. 

I have tree samples from two trials conducted at different sites. Each site was divided into several blocks, and tree seedlings from 4 origins were randomly planted within each block. More than 50 years later, two blocks from each site were selected and 3 trees from each of the origins were harvested. Each tree was then "cut" into wood products using 3 different cutting methods (in simulation software).

I would like to test if there are differences in product recovery between the two sites, among the origins, and among the cutting methods.

1) I am not interested in blocks, so the block factor will be a random variable. Should I consider trees from each origin as subsampling or as another random factor, or maybe a replication?  

2) The analysis I have done so far assumes the trees were randomly selected from each origin within each block. What if the trees were randomly selected not at the origin level, but at the block level, to cover the full range of tree sizes within the block (tree size has a huge effect on recovery)? If this were the case, should I drop origin from the model, because its levels were not considered in the sampling?

Here is what I have so far. I am not confident that this is a correct analysis and would very much appreciate it if anyone could point me to the correct way to analyze the data (attached).

 

PROC GLIMMIX data=data;
class site origin block cut tree;
model recovery = site*origin*cut/ddfm=kr;
random block site*origin*cut*block;
run;

 

Thank you very much.

Tess

SteveDenham
Jade | Level 19

The code ought to converge, so it becomes a matter of whether or not your questions of interest can be addressed.

 

Unless you have missing cells, you will probably be happier fitting a factor model rather than a means model.  Change the '*' to '|' in your MODEL statement. This will give F tests for the main effects, the two-way interactions, and the three-way interaction.

 

Then you can look at block and specific block interactions as random effects in the RANDOM statement, eliminating those that have a zero variance component.

 

Alternatively, you can stay with the means model and, through the use of the LSMESTIMATE statement with the JOINT option, create tests equivalent to those in the factor model.  This is harder to do in most cases.
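
As a rough sketch of the means-model route, the statements below test the site main effect and the 2-df cut main effect from the site*origin*cut cell means. The coefficients assume 2 sites, 4 origins and 3 cutting methods (as described in the question) and that cut varies fastest within origin within site, which should follow from the CLASS statement order. The labels are made up for illustration, and it is worth checking the coefficient layout against your own output before trusting the tests.

```sas
proc glimmix data=data;
  class site origin block cut tree;
  model recovery = site*origin*cut / ddfm=kr;
  random block site*origin*cut*block;

  /* Site main effect: mean of the 12 site-1 cells minus the 12 site-2 cells */
  lsmestimate site*origin*cut 'site 1 vs site 2'
     1  1  1  1  1  1  1  1  1  1  1  1
    -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 / divisor=12;

  /* Cut main effect (2 df): two rows tested together with JOINT */
  lsmestimate site*origin*cut
    'cut 1 vs cut 3' 1 0 -1  1 0 -1  1 0 -1  1 0 -1
                     1 0 -1  1 0 -1  1 0 -1  1 0 -1,
    'cut 2 vs cut 3' 0 1 -1  0 1 -1  0 1 -1  0 1 -1
                     0 1 -1  0 1 -1  0 1 -1  0 1 -1 / divisor=8 joint;
run;
```

Every main effect and interaction needs its own set of rows like these, which is why the factor model is usually the easier path.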

 

SteveDenham

tess921
Calcite | Level 5

Thank you very much @SteveDenham for the quick response!

It may be a silly question, but I could not find a clear answer about what a factor model or a means model is, or how they differ. Would you mind elaborating a little?

When you say "eliminating those that have a zero variance component", do I have to specify the interaction terms myself? Or, if I specify "site | origin | cut | block" as random, will it eliminate them automatically?

Thank you very much,

Tess

SteveDenham
Jade | Level 19

A factor model is of the type:

 

model Y = A B A*B C A*C B*C A*B*C;

which can also be written as

model Y = A|B|C;

 

A means model is a one-way analysis where

 

model Y = A*B*C;

 

Comparison tests and lower order effects are then obtained from either CONTRAST or LSMESTIMATE statements.  So using a factor model, you would have something like this:

 

PROC GLIMMIX data=data;
class site origin block cut tree;
model recovery = site|origin|cut/ddfm=kr;
random block block*site block*origin block*cut block*site*origin block*site*cut block*origin*cut block*origin*site*cut;

run;

There are 8 random effects here, and unless you have about 10^8 data points, at least one will probably be estimated to be zero.  You could test them by using the COVTEST statement with the TESTDATA= option.
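
A minimal sketch of such a test, cut down to two random effects so the parameter order is easy to follow. The data set name test_bs and the label are hypothetical. The sketch assumes the covariance parameters are ordered as in the RANDOM statement followed by the residual (block, block*site, Residual), that the TESTDATA= data set may supply them as variables Covp1-Covp3, and that a missing value leaves a parameter unconstrained. Please confirm those details in the COVTEST documentation before relying on the result.

```sas
/* Hypothetical data set: constrain only the block*site variance to 0.
   Assumed order: covp1 = block, covp2 = block*site, covp3 = Residual. */
data test_bs;
  input covp1 covp2 covp3;
  datalines;
. 0 .
;
run;

proc glimmix data=data;
  class site origin block cut tree;
  model recovery = site|origin|cut / ddfm=kr;
  random block block*site;
  /* Likelihood ratio test of the reduced model read from test_bs */
  covtest 'block*site variance = 0' testdata=test_bs;
run;
```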

Something to consider is that three-way and higher interactions for random effects are generally indistinguishable from residual error, and fitting them is probably not worthwhile without that really big data set.
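
Following that reasoning, one possible reduced version keeps only block and its two-way interactions with the fixed factors, and leaves the higher-order terms in the residual. This is a sketch of one reasonable starting point rather than the only defensible choice:

```sas
proc glimmix data=data;
  class site origin block cut tree;
  model recovery = site|origin|cut / ddfm=kr;
  /* Three-way and four-way block interactions are left in the residual */
  random block block*site block*origin block*cut;
run;
```

If any of the remaining variance components is still estimated as zero, it can be dropped in the same way.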

 

I hope this helps some.

 

SteveDenham

 

Hawi
Calcite | Level 5

Can you help me with the syntax for a design with two levels of organic, two levels of cropping system, and four rates of nitrogen?

Thanks

SteveDenham
Jade | Level 19

The syntax would be the same as for the above design, so I will repeat that code, and then insert your factors.

 

/* previous poster */
PROC GLIMMIX data=data;
class site origin block cut tree;
model recovery = site|origin|cut/ddfm=kr;
random block block*site block*origin block*cut block*site*origin block*site*cut block*origin*cut block*origin*site*cut;
run;

/* change variable names to organic, system, nitrogen */
PROC GLIMMIX data=data;
class  block organic system nitrogen;
model recovery = organic|system|nitrogen/ddfm=kr2;
random block block*organic block*system block*nitrogen block*organic*system block*organic*nitrogen block*system*nitrogen block*organic*system*nitrogen;
run;

Again, I want to point out that you probably do not have enough data to estimate the three-way and four-way effects, and they probably should be considered residual error in any case. To improve the ability to converge, you may then want this for the RANDOM statement:

 

random intercept organic system nitrogen/subject=block;
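
Put together as a complete step, that might look like the sketch below. The data set name data and the response recovery are carried over from the earlier code as placeholders, so substitute your own names. With SUBJECT=block, the RANDOM statement fits the same effects as block, block*organic, block*system and block*nitrogen.

```sas
proc glimmix data=data;
  class block organic system nitrogen;
  model recovery = organic|system|nitrogen / ddfm=kr2;
  /* Random block intercept plus block-by-factor two-way terms */
  random intercept organic system nitrogen / subject=block;
run;
```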

SteveDenham

 

 
