I have the following problem:
A study was conducted at 2 locations (1, 2) over 2 years (2012, 2013). Each location has 4 blocks. Within each block, we have a split-plot design with repeated measures. The main plot is gypsum treatment and the sub plot is peanut cultivar. Soil samples were taken from each sub plot at planting, mid bloom and harvest. The response is soil Ca content.
Objectives:
1. Is there a difference between locations or years? If not, I can pool my data across locations and years.
2. Is there a difference between treatments?
3. Is there a difference between sampling times?
Here is my SAS code; I don't know if it's correct. If I want to test the year effect, I have to treat year as a repeated factor. But I already have repeated measures within each block (timing). Can I use two repeated statements at the same time?
proc mixed;
class year location block treatment cultivar timing;
model Ca=year| location| treatment| cultivar| timing/ddfm=kr;
random block block(location) treatment*block(location);
repeated timing/subject=block(treatment*cultivar) type=csh;
repeated year/subject=location(block*treatment*cultivar) type=csh;
lsmeans year| location| treatment| cultivar| timing/diff;
run;
Thanks!
With only two values for year, one way of addressing the doubly repeated nature would be to fit year as a G-side repeated effect. Just change the last repeated statement (the one for year) into a random statement.
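For illustration, here is a rough sketch of what that change might look like. The data set name soil is a placeholder, and I have assumed a wider subject for timing (adding year and location) so that each subject has only one observation per sampling time. Neither is from the original post.

```sas
/* Sketch only: "soil" is a hypothetical data set name. */
proc mixed data=soil;
  class year location block treatment cultivar timing;
  model Ca = year|location|treatment|cultivar|timing / ddfm=kr;
  /* design random effects as listed in the question */
  random block block(location) treatment*block(location);
  /* year moved from a second REPEATED statement to a G-side RANDOM statement */
  random year / subject=location(block*treatment*cultivar) type=csh;
  /* timing is now the only R-side repeated effect;
     subject widened (assumption) to one sub plot in one year at one location */
  repeated timing / subject=year*location*block*treatment*cultivar type=csh;
run;
```

With only two years, the CSH structure on the random statement amounts to two variances and one correlation.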
I would prefer the G-side approach to the Kronecker product method, because of the heterogeneity. If you do go the Kronecker route, you need only one repeated statement:
repeated year timing / type= UN@CS subject=location(block*treatment*cultivar);
The only way to get heterogeneity of variance with this approach would be to fit UN@UN, and whether that would work depends on how many levels of timing you have, and how many total observations are available.
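To make the Kronecker structure concrete: with two years and three sampling times (planting, mid bloom, harvest), type=UN@CS gives each subject a 6 x 6 covariance block of the following form, as I understand the documentation. The symbols are my own notation, not something from the thread.

```latex
\[
\operatorname{Var}(\mathbf{y}_{\text{subject}})
=
\begin{pmatrix}
\sigma_1^2 & \sigma_{12} \\
\sigma_{12} & \sigma_2^2
\end{pmatrix}
\otimes
\begin{pmatrix}
1 & \rho & \rho \\
\rho & 1 & \rho \\
\rho & \rho & 1
\end{pmatrix}
\]
```

The variance of an observation in year i is \sigma_i^2 at every sampling time, so the variances can differ between years but not between sampling times within a year. UN@UN replaces the second matrix with an unstructured one, which lifts that restriction at the cost of more parameters.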
Steve Denham
Thank you Steve!
I am not familiar with the Kronecker product method, but I will try to figure it out.
The random factors are block, block(location) and treatment*block(location), am I correct?
Rui
I agree that those are the random effects. Check the documentation for the type= option, which has a short example with height and weight as two kinds of measures, recorded over years, so that the measures are UN and year is CS or AR(1). In this case, to get heterogeneity you would have to use UN for year.
Good luck.
Steve Denham
Hi Steve,
I ran the code, but got two messages:
NOTE: Estimated G matrix is not positive definite.
NOTE: Asymptotic variance matrix of covariance parameter estimates has been found to be singular and a generalized inverse was used. Covariance parameters with zero variance do not contribute to degrees of freedom computed by DDFM=KENWARDROGER.
Can I trust the results despite these notes?
Thank you!
NOTE: Estimated G matrix is not positive definite.
NOTE: Asymptotic variance matrix of covariance parameter estimates has been found to be singular and a generalized inverse was used. Covariance parameters with zero variance do not contribute to degrees of freedom computed by DDFM=KENWARDROGER.
These notes are not show stoppers in any way. The first says that there are more random effects in the model than can be fit with the data, while the second says, OK, now let's proceed with the analysis, and correct for those components whose estimates were zero.
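If you want to see which components were set to zero, one option is to capture the covariance parameter table and list the zero estimates. A rough sketch, where soil and covparms are placeholder data set names and the model is only a stand-in for whichever one produced the notes:

```sas
/* Sketch only: "soil" and "covparms" are hypothetical data set names. */
proc mixed data=soil covtest;
  class year location block treatment cultivar timing;
  model Ca = year|location|treatment|cultivar|timing / ddfm=kr;
  random block block(location) treatment*block(location);
  /* subject chosen (assumption) so each subject has one observation per timing */
  repeated timing / subject=year*location*block*treatment*cultivar type=csh;
  /* save the "Covariance Parameter Estimates" table */
  ods output CovParms=covparms;
run;

/* list the components estimated at the zero boundary */
proc print data=covparms;
  where Estimate = 0;
run;
```

Any random effect that shows up in this listing is a candidate for removal from the random statement.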
As far as the latter two analyses go, I have to ask "Why?" Why do a subset analysis in the first case, when you have already accommodated any difference in fertility due to location AND made it applicable to locations beyond the ones fit in your first analysis? Now if there is substantial heterogeneity in variance due to location, then that should be addressed, but probably not by a subset analysis.
I like the approach of treating year as a random effect, and having only timing as a repeated effect with the sp(pow) error structure. However, I would still include location as a random effect. It is a design factor: dropping it to do subset analyses reduces power, and leaving it out of the model ignores part of the design.
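A rough sketch of that model, under some assumptions of mine: soil is a placeholder data set name, days is a hypothetical numeric variable (for example, days after planting) that sp(pow) needs as its coordinate, and the list of random effects is one possible choice rather than the definitive one.

```sas
/* Sketch only: "soil" and "days" are hypothetical names.
   days must be numeric and must NOT appear in the CLASS statement. */
proc mixed data=soil;
  class year location block treatment cultivar timing;
  model Ca = treatment|cultivar|timing / ddfm=kr;
  /* year and location both treated as random, plus the design effects */
  random year location block(location) treatment*block(location);
  /* one subject = one sub plot in one year at one location;
     sp(pow)(days) lets the correlation decay with the actual time gap */
  repeated timing / subject=year*location*block*treatment*cultivar
                    type=sp(pow)(days);
run;
```

Because sp(pow) works from the coordinate rather than from the order of the levels, it handles unequally spaced sampling times.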
Steve Denham
Hi Steve,
1. I actually believe that there is a location effect, since location 2 has higher fertility and regular irrigation whereas location 1 does not. Therefore I tried to test the year effect within each location. Here is my SAS code:
proc mixed;
class year block treatment cultivar timing;
model M1_Ca= year| treatment| cultivar| timing/ddfm=kr;
random intercept treatment/subject=block;
repeated year timing/subject=block(treatment*cultivar) type=un@cs;
lsmeans year| treatment| cultivar| timing/diff;
run;
I think the random factors are block and block*treatment. I am not sure if it's correct to use the UN@CS covariance structure. Since the sampling times for the repeated factor "timing" were not equally spaced, should I use sp(pow)?
2. I ran the above code and found that the year effect is not significant. So I think I can treat year as a random effect and run the test again. Here is my SAS code with year as a random effect:
Proc mixed;
class year block treatment cultivar timing;
model Ca=treatment| cultivar| timing/ddfm=kr;
random year block year*block year*treatment block*treatment year*treatment(block);
repeated timing/subject=year*block*treatment*cultivar type=sp(pow);
run;
Am I correct with the random effects and the covariance structure?