BookmarkSubscribeRSS Feed
turfgrassscientist
Calcite | Level 5

Hello all:

I am seeking advice for the analysis of a field research study that used a 2 x 4 factorial plus control arrangement of treatments.

The study investigated the effects of frequency (once or biweekly) and rate (0.075, 0.15, 0.3, and 0.6 L m^-2) of a product applied to a field crop.  First I analyzed the data as a factorial (omitting the control) and used orthogonal polynomials to evaluate rate response (linear, quadratic, cubic).  Second, I analyzed the data using one-way single degree-of-freedom contrasts to compare specific treatments (e.g., control vs. all treatments).

I am curious if it is valid to use to the untreated control as a "0" level while evaluating response (linear, quadratic, cubic, and quartic) to rate (e.g. 0, 0.075, 0.15, 0.3 and 0.6) by using single degree of freedom contrast to represent the orthogonal polynomials.

I used PROC IML to generate coefficients for these levels:

0 (control)0.0750.150.30.6
Linear-0.474342-0.31623-0.158110.1581140.790569
Quadratic0.54596420.035224-0.33462-0.651640.40507
Cubic-0.4699870.4222580.514346-0.57050.10388
Quartic0.2367184-0.721430.631249-0.157810.011272

The rates (0.075, 0.15, 0.3 and 0.6) appear in each level of the frequency factor (twice per replication) and there is only one control ("0") per replication.  Is it possible to divide the above coefficients for the 4 rates (0.075, 0.15, 0.3 and 0.6) by 2 and construct contrast statements as follows:

TRMT123456789
FREQSingleBiweekly
RATEControl (0)0.0750.150.30.60.0750.150.30.6
Linear-0.474342-0.15811-0.079060.0790570.395285-0.15811-0.079060.0790570.395285
Quadratic0.54596420.017612-0.16731-0.325820.2025350.017612-0.16731-0.325820.202535
Cubic-0.4699870.2111290.257173-0.285250.051940.2111290.257173-0.285250.05194
Quartic0.2367184-0.360710.315624-0.078910.005636-0.360710.315624-0.078910.005636

I am having a very hard time finding any texts, articles, or online resources that answer my question.  I can provide further information if needed.  Your help would be greatly appreciated!

Thanks!
15 REPLIES 15
SteveDenham
Jade | Level 19

If you are using mixed model methods, it shouldn't be a problem that the control only appears once per replication, and the other application rates appear twice--you just will have a greater error around your estimate of the control.  The coefficients that you have for the contrasts make sense to me--any marginal rate estimate is the mean of the frequency estimates.

Now comes the question--how do you interpret a significant quartic effect?  Or a combination of polynomial effects?  Is your ultimate aim to build some sort of dose response model?

Steve Denham

turfgrassscientist
Calcite | Level 5

Hi Steve,

That is a good question--how do you interpret a significant quartic effect?  First, I must declare that I have a very limited understanding statistics in general and that I have just recently begun to learn about trend analysis.  But, regarding your question, I don't know how to interpret a significant quartic effect. In fact, I have noticed that agricultural researchers will often include only linear and quadratic components in their analysis because they determine that there is "no biological reason to consider higher degree components."  Yourstone (1991) mentions this topic in a NESUG proceedings article here. The author also mentions the creation of another contrast, a "lack of fit (LOF)" contrast, in which to group the higher order components.  Is this a common practice?  If so, is it something I should consider?

Additionally, I am confused about how to interpret a combination of significant polynomial effects.  I've actually been exposed to two opposing views regarding this topic: 1) significant higher order effects "trump" significant lower order effects (i.e. if both linear and quadratic effects are significant, then the quadratic effect is what should be discussed) or, in contrast, 2) ignoring combinations of a trend is a misinterpretation.  I would greatly appreciate your input on this topic.

The ultimate aim of this study was to evaluate the rate response within each frequency.  Specifically, this trial was initiated to determine the effects of the rate and frequency of applications of a product on disease severity and overall quality of a stand of turfgrass.  Nine to ten disease severity ratings were taken each season during this 2 year field study.  Response to the product changed from the beginning to the end of each year; the product slightly increased disease severity at the beginning of the season but then decreased disease severity by the end of the season.  So, I don't think that building a dose response model was our goal, rather we are interested in observing the changes in disease severity response to these factors across the individual evaluation dates during each season.  Please let me know if you agree with this logic.


Thank you very much for your help, Steve!


Best,

James

turfgrassscientist
Calcite | Level 5

Hi Steve,

I forgot to mention that contrast statements in PROC GLM were used to analyze this factorial plus control experiment similar to Pesek (1991) discussed in the SAS Global Forum Proceedings/SUGI 91.  I didn't use the the actual Macros that Pesek wrote, but I used contrasts statements identical to the ones his macros generated for a similar experiment. I have copied and pasted my program below.

Thanks again,

James

PROC GLM;

CLASS REP TRMT;

MODEL ANTH519 ANTH528 ANTH604 ANTH611 ANTH621 ANTH628 ANTH707 ANTH715 ANTH728 ANTH811 AUDPC= REP TRMT;

title "2010 Infrequent Topdressing Disease Severity:  Factorial plus control -- Orthogonal Contrasts";

/*Frequency main effect*/

CONTRAST 'FREQ' TRMT 0  1       1       1       1       -1      -1      -1      -1;

/*Rate main effect*/

CONTRAST 'RATE' TRMT 0  1       -1      0       0       1       -1      0       0,

                             TRMT 0  1       0       -1      0       1       0       -1      0, 

                             TRMT 0  1       0       0       -1      1       0       0       -1;

/*Interaction effect*/

CONTRAST 'FREQ x RATE'  TRMT 0        1       -1      0       0       -1      1       0       0,

                                           TRMT 0       1       0       -1      0       -1      0       1       0,         

                                           TRMT 0       1       0       0       -1      -1      0       0       1;

/*Control vs all topdressing*/

CONTRAST 'CTRL VS. ALL' TRMT 8  -1      -1      -1      -1      -1      -1      -1      -1;

/*Orthogonal polynomials for trend comparison of rate effect (averaged over frequencies, control included)*/

CONTRAST 'Linear Rate Effect for RATE'  TRMT   -0.47434200 -0.15811400 -0.07905700 0.07905695 0.39528470 -0.15811400 -0.07905700 0.07905695 0.39528470;;

CONTRAST 'Quadratic Rate Effect for RATE' TRMT  0.54596420 0.01761175 -0.16731150 -0.32581750 0.20253510 0.01761175 -0.16731150 -0.32581750 0.20253510;

CONTRAST 'Cubic Rate Effect for RATE' TRMT     -0.46998700 0.21112905 0.25717315 -0.28524900 0.05194000 0.21112905 0.25717315 -0.28524900 0.05194000;

CONTRAST 'Quartic Rate Effect for RATE' TRMT   0.23671840 -0.36071350 0.31562445 -0.07890600 0.00563615 -0.36071350 0.31562445 -0.07890600 0.00563615;*/

MEANS TRMT / LSD LINES;

MEANS TRMT;

RUN;

ods rtf close;

Quit;

SteveDenham
Jade | Level 19

The code looks fine, although I would shift to PROC GLIMMIX, for a couple of reasons.  First, I would view REP as a random effect in the sense that I would hope to infer across all possible REPs.  Second, this looks like a split plot design, and GLM just does not correctly do the F tests.  Third, you mention that the response variables are severity scores, which may complicate things a lot as they probably are not Gaussian.  For now, I don't want to go down that road, because it opens up a whole other set of problems, the least of which is that everything would switch to odds ratios and cumulative logits and the contrasts are difficult to interpret, etc., etc.  Just the first two though for now.

PROC GLIMMIX;

CLASS REP TRMT;

MODEL ANTH519 =  TRMT; /* Since there was not a MANOVA for all of the variables, I assume that you want "univariate" analyses for each variable.  GLIMMIX will not accept multiple dependent variables in the same manner as GLM, so I cut this down to just the first for now.  Note that TRTMT is the only effect here */

RANDOM intercept/subject=REP; /* Introduces REP as a random effect.  If you wish to include REP*TRMT as a random effect, and I think you will then this becomes RANDOM intercept TRMT/subject=REP */

title "2010 Infrequent Topdressing Disease Severity:  Factorial plus control -- Orthogonal Contrasts";

/*Frequency main effect*/

CONTRAST 'FREQ' TRMT 0  1       1       1       1       -1      -1      -1      -1;

/*Rate main effect*/

CONTRAST 'RATE' TRMT 0  1       -1      0       0       1       -1      0       0,

                             TRMT 0  1       0       -1      0       1       0       -1      0, 

                             TRMT 0  1       0       0       -1      1       0       0       -1;

/*Interaction effect*/

CONTRAST 'FREQ x RATE'  TRMT 0        1       -1      0       0       -1      1       0       0,

                                           TRMT 0       1       0       -1      0       -1      0       1       0,         

                                           TRMT 0       1       0       0       -1      -1      0       0       1;

/*Control vs all topdressing*/

CONTRAST 'CTRL VS. ALL' TRMT 8  -1      -1      -1      -1      -1      -1      -1      -1;

/*Orthogonal polynomials for trend comparison of rate effect (averaged over frequencies, control included)*/

CONTRAST 'Linear Rate Effect for RATE'  TRMT   -0.47434200 -0.15811400 -0.07905700 0.07905695 0.39528470 -0.15811400 -0.07905700 0.07905695 0.39528470;;

CONTRAST 'Quadratic Rate Effect for RATE' TRMT  0.54596420 0.01761175 -0.16731150 -0.32581750 0.20253510 0.01761175 -0.16731150 -0.32581750 0.20253510;

CONTRAST 'Cubic Rate Effect for RATE' TRMT     -0.46998700 0.21112905 0.25717315 -0.28524900 0.05194000 0.21112905 0.25717315 -0.28524900 0.05194000;

CONTRAST 'Quartic Rate Effect for RATE' TRMT   0.23671840 -0.36071350 0.31562445 -0.07890600 0.00563615 -0.36071350 0.31562445 -0.07890600 0.00563615;*/

LSMEANS TRMT / DIFF LINES;

RUN;

Now, as to the other part of interpreting polynomial contrasts: I like the idea of lack of fit after the quadratic and linear in most cases.  Higher order trends are almost always just fits to the noise, especially for smaller studies, and when you "exhaust" all of the degrees of freedom to fit all possible orders, such as in this case.  As far as interpretation, pictures beat descriptions hands down.  A graphical representation of the means will make the trend tests obvious.

Steve Denham

turfgrassscientist
Calcite | Level 5

Hi Steve:


Important pieces of info I left out previously:  1) the trial was a 2 x 4 factorial plus control arranged as a randomized complete block design with 4 reps--this was not a split plot design, and 2) disease severity was assessed as percent area infested using a line-intercept grid count method that produced 273 observations over the 6 x 6 ft. plot (experimental unit).  From my observations, most researchers in our field to publish disease severity data without being transformed; however, occasionally I see some researchers perform square root transformations on disease severity data.

Question:  do I need to include REP*TRMT as a random effect if the design is not split-plot? 

Also, I compared my outputs from GLM and GLMMIX (using REP as a random effect) and they were identical (for the first rating date "ANTH519").  Does this mean I can continue using GLM?

Thanks for your reply!

James

SteveDenham
Jade | Level 19

Two points:

I mistook the RCBD for a split plot.  In my world, everything is a split plot.  Anyway, I would still include REP*TRMT as a random effect (because of what I am going to say next.).

The disease severity almost certainly follows a beta distribution of some sort.  It is a ratio of two random variables (affected area divided by total area).  So long as there is never exactly zero or exactly one for this, a beta distribution is the most appropriate assumption, at least from first principles.  This means GLM is probably not appropriate.  It also means keeping the REP*TRMT random effect so that proper standard errors are calculated.

PROC GLIMMIX;

CLASS REP TRMT;

MODEL ANTH519 =  TRMT/dist=beta; /* Add the distributional assumption */

RANDOM intercep TRMT/subject=REP;

title "2010 Infrequent Topdressing Disease Severity:  Factorial plus control -- Orthogonal Contrasts";

/*Frequency main effect*/

CONTRAST 'FREQ' TRMT 0  1       1       1       1       -1      -1      -1      -1;

/*Rate main effect*/

CONTRAST 'RATE' TRMT 0  1       -1      0       0       1       -1      0       0,

                             TRMT 0  1       0       -1      0       1       0       -1      0, 

                             TRMT 0  1       0       0       -1      1       0       0       -1;

/*Interaction effect*/

CONTRAST 'FREQ x RATE'  TRMT 0        1       -1      0       0       -1      1       0       0,

                                           TRMT 0       1       0       -1      0       -1      0       1       0,         

                                           TRMT 0       1       0       0       -1      -1      0       0       1;

/*Control vs all topdressing*/

CONTRAST 'CTRL VS. ALL' TRMT 8  -1      -1      -1      -1      -1      -1      -1      -1;

/*Orthogonal polynomials for trend comparison of rate effect (averaged over frequencies, control included)*/

CONTRAST 'Linear Rate Effect for RATE'  TRMT   -0.47434200 -0.15811400 -0.07905700 0.07905695 0.39528470 -0.15811400 -0.07905700 0.07905695 0.39528470;;

CONTRAST 'Quadratic Rate Effect for RATE' TRMT  0.54596420 0.01761175 -0.16731150 -0.32581750 0.20253510 0.01761175 -0.16731150 -0.32581750 0.20253510;

CONTRAST 'Cubic Rate Effect for RATE' TRMT     -0.46998700 0.21112905 0.25717315 -0.28524900 0.05194000 0.21112905 0.25717315 -0.28524900 0.05194000;

CONTRAST 'Quartic Rate Effect for RATE' TRMT   0.23671840 -0.36071350 0.31562445 -0.07890600 0.00563615 -0.36071350 0.31562445 -0.07890600 0.00563615;*/

LSMEANS TRMT / DIFF LINES ilink; /* The ilink option puts the lsmeans back on the original observed scale.  WARNING: The diffs are NOT the differences between the observed scale values, but are the "back-transformed" values of the diffs on the distributional scale.  This should be OK, as that is where all of the testing goes on. */

RUN;

Hope this helps some.

Steve Denham

deniscosta
Calcite | Level 5

Dear Steven,

I am following your advices, but something is going wrong. When I run the programm one messsage say:

'Estimated G matrix is not positive definite'

if I remember well it should be a matrix positive definite.

Could you help me?

Best regards

Denis

SteveDenham
Jade | Level 19

Ordinarily, this is not a problem.  It simply means that you don't have enough data to estimate all of the random effects in your model.  For normally distributed data, just make sure you use ddfm=kenwardrogers2 (or some equivalent) to accommodate this in repeated measures models.  Otherwise, you can inspect the random effects and delete the ones with an estimated value of 0, but you don't have to do that--the standard errors and tests all accommodate an NPD G matrix.

A nonpositive definite Hessian matrix, however, is a red flag that none of the results are truly meaningful.

Steve Denham

FestusAdejoro
Obsidian | Level 7

Dear Steven,

I am a beginer with the use of SAS and particulalarly Orthogonal contrast. Your contributions above has really helped me create a contrast command for my analysis and wish to seek your opinion on it. Will this be effective to interprete the results of my data???

My experiment involve the use of 2 sources of Nitrogen at 3 levels of inclusion plus a control as diet additives to feed experimental animals.The response was simulated in the laboratory and the procedure repeated 3 times to give 3 replicates. The various responses are labeled as variables 1-21 as depicted in the SAS command i just prepared. I used Proc IML to get the coefficients for the polynomial effects of N-level. Here is what i have:

N-sources= 2

N-Levels =3

Control=1

Total treatments= 7

PROC GLM;

CLASS Treatment Level Replicate;

MODEL v1-v21= Replicate Treatment Level Treatment*Level;

/*Control vs all N-supplemented*/

CONTRAST 'CTRL Vs. All' Treatment +6  -1  -1  -1  -1  -1  -1;

/*N-Source main effect*/

CONTRAST 'Urea Vs. Nitrate' Treatment 0  1  1  1  -1  -1  -1;

/*N-Level main effect*/

CONTRAST 'Level' TRMT 0        +1        -1         0          +1        -1         0,

                             TRMT 0        +1        0          -1         +1        0          -1;

/*Interaction effect*/

CONTRAST 'Source x Level'  TRMT 0     +1        -1         0          -1         +1        0,

                                           TRMT 0      +1        0          -1         -1         0          +1;        

/*Orthogonal polynomials for trend comparison of Level effect (averaged over frequencies, control included)*/

CONTRAST 'Linear Rate Effect for Level' Treatment  -0.67082  -0.1118035  0.1118035  0.3354102  -0.1118035  0.1118035  0.3354102;

CONTRAST 'Quadratic Rate Effect for Level' Treatment 0.5  -0.25  -0.25  0.25  -0.25  -0.25  0.25;

CONTRAST 'Cubic Rate Effect for Level' Treatment  -0.223607  0.3354102  -0.3354102  0.1118035  0.3354102  -0.3354102  0.1118035;*/

LSMeans Treatment /DIFF lines;

  Run;


Thanks in anticipation of your kind suggestions and advise

SteveDenham
Jade | Level 19

Hello Festus,

Return to the top of the thread, and see my comments regarding the use of GLM in this context.  You should seriously consider using PROC MIXED or GLIMMIX, unless you want a 'standard' multivariate analysis of the 21 variables.  I am afraid that the tests you get from GLM are using the residual error as a denominator, and do not correctly address the nature of replicate.  If the design is a split plot, then there should be a treatment by replicate random term to get the correct test.

I suspect that the 21 variables may be either body weights taken at 21 successive intervals--is that the case?  If so, then there are other design considerations to take into account.  On the other hand, they may represent a clinical chemistry panel, and still other design and experimental objectives need to be considered before coming up with a model.  Please fill us in on these details.

Steve Denham

FestusAdejoro
Obsidian | Level 7

Dear Steven,

Thank you very much. I will apply Proc GLIMMIX as you advised. The 21 variables were not weights taken at successive periods. They include Gas production at 24 hours, Methane production, Feed digestibility, Ammonia NItrogen production etc...All the parameters are derived from the animals following the digestion of the experimental diets.

You may also help me to clarify if i will be in order to

1. Evaluate an interaction effect of N-Source and Level using Level 1 Vs Leve 2; and Level 1 Vs Level 3 (what about Level 2 Vs 3) as shown in the SAS command i prepared

     /*Interaction effect*/

         CONTRAST 'Source x Level'  TRMT 0     +1        -1         0          -1         +1        0,

                                                      TRMT 0      +1        0          -1         -1         0          +1;   

2. If i have to do the trend comparison (linear, quadratic and cubi effect of N-level) separately from the  previous contrasts because the number of orthogonal contrast alowed is just 6 based on my 2X3 factorial +1.


SteveDenham
Jade | Level 19

1.  With 3 levels, you have 2 independent comparisons, as presented in your CONTRAST statement.  If you wished to look at all 3 comparisons within Level, I would encourage you to apply some sort of adjustment for multiple comparisons to accommodate the loss of independence.

2. I am afraid I did not understand question 2, but your approach is such that there are 5 independent comparisons available to a 2x3 factorial design.  Thus, the linear, quadratic and cubic trends on N-level can be set up as independent of the comparisons involving Level.  Is that what you are looking for?

Steve Denham

deniscosta
Calcite | Level 5

Hello my research friend,

I have the same problem. Have you solved the problem? Can you give me an e-mail account to contact you?

Best regards

FestusAdejoro
Obsidian | Level 7

Dear Steven,

Your my objective aligns with your thoughts on the analysis. However, while trying to practice with 2 of the parameters (24H Gas and 24H Methane), i kept getting an Error Mesaage stating as follows:

             200

ERROR 22-322: Syntax error, expecting one of the following: (, /, =.

ERROR 200-322: The symbol is not recognized and will be ignored.

The SAS syntax and actual data input was:

Title 'Gas and methane';

Data Gasandmethane;

Label v1= Gas24, v2=Methane24;

input Treatment$ Replicate v1-v2;

cards;

Control174.112915.91843325
Control271.358517.9904604
Control362.1116115.71941815
Urea1175.490115.04878412
Urea1263.6855511.51948683
Urea1369.9813116.30495324
Urea2156.0125910.69626777
Urea2260.3409211.40405239
Urea2370.7682817.17851915
Urea3177.2607817.3410505
Urea3265.4562314.48968449
Urea3379.6216911.58406475
Nitrate1159.357217.178511302
Nitrate1267.4236514.80763354
Nitrate1356.7995610.86226791
Nitrate2145.585245.513145744
Nitrate2256.602828.706394519
Nitrate2352.864717.230013325
Nitrate3148.733122.611616701
Nitrate3246.765693.318005431
Nitrate3356.602825.138831879

;

Proc Glimmix Data = Gasandmethane;

Class Replicate Treatment;

Model v1-v2 = Treatment;/

Random intercept / Subject = Replicate;/

title 'Gas and Methane from Urea and Nitrate-Orthogonal contrasts';

/*N-Source main effect*/

Contrast 'Source' Treatment 0  1 1  1  -1  -1  -1;

/*N-Level main effect*/

Contrast 'Level' Treatment     0 +1  -1  0  +1   -1  0,

Treatment     0  +1 0  -1  +1 0  -1;

/*Interaction effect*/

Contrast 'Source x Level'  Treatment     0 +1  -1   0  -1   +1 0,

                                              Treatment     0 +1  0   -1 -1    0  +1;

/*Control vs all N-supplemented*/

Contrast 'Control Vs. All' Treatment     +6 -1  -1  -1 -1  -1  -1;

       

/*Orthogonal polynomials for trend comparison of Level effect (averaged over frequencies, control included)*/

Contrast 'Linear Effect for Level' Treatment  -0.67082 -0.1118035  0.1118035  0.3354102 -0.1118035  0.1118035  0.3354102;;

Contrast 'Quadratic Effect for Level' Treatment 0.5  -0.25  -0.25 0.25  -0.25  -0.25 0.25;

Contrast 'Cubic Effect for Level' Treatment  -0.223607 0.3354102  -0.3354102 0.1118035  0.3354102 -0.3354102  0.1118035;*/

LSMeans Treatment /DIFF lines;

  Run;


I shall be grateful for your guidance.

Thank You.

Festus

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 15 replies
  • 5635 views
  • 4 likes
  • 4 in conversation