GBR2015
Calcite | Level 5

Hi,


I'm trying to figure out how to use PROC MIXED and/or PROC GLIMMIX in SAS to correctly analyse the following experimental design. The design was a SPLIT PLOT with repeated measures over DEPTH and TIME. The whole plot is a fixed treatment and the split plot is a fixed treatment. Within each split plot, samples were taken at five depths during three times. Replications were arranged in RANDOMIZED BLOCKS.


Proc Glimmix data=data;

            CLASS Block WholePlot SplitPlot Time Depth;

            MODEL y= WholePlot | SplitPlot | Time | Depth / ddfm=kr2;                

            RANDOM intercept Block | WholePlot / subject= Block;                                         

            RANDOM Time /  ?

            RANDOM Depth / ?

            run;

           

Proc Mixed data=data;

            CLASS Block WholePlot SplitPlot Time Depth;

            MODEL y= WholePlot | SplitPlot | Time | Depth / ddfm=kr2;                

            RANDOM intercept Block | WholePlot / subject= Block;                                         

            REPEATED Time /  ?

            REPEATED Depth / ?

            run;

I hope someone can help me.

Thank you in advance !        

13 REPLIES
BeverlyBrown
Community Manager

Hi there, I have moved your inquiry to the SAS Statistical Procedures Community, where more experts will see it. Thank you for using SAS Online Communities!


SteveDenham
Jade | Level 19

Consider:

Proc Mixed data=data;

            CLASS Block WholePlot SplitPlot Time Depth;

            MODEL y= WholePlot | SplitPlot | Time | Depth / ddfm=kr2;                

            RANDOM intercept WholePlot / subject= Block;                                         

            REPEATED Depth Time / type=un@un subject=block;

            run;

This doubly repeated measures approach, with a Kronecker product for the two factors, is outlined under the TYPE= option of the REPEATED statement in the documentation, where a short example with height and weight over time is given.
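For reference, the direct (Kronecker) product structure that TYPE=UN@UN requests can be written, in my notation, as:

```latex
% Covariance for one subject (Block), with unstructured matrices for
% Depth and Time combined via a Kronecker (direct) product:
\operatorname{Var}(\mathbf{y}_{\text{block}})
  = \boldsymbol{\Sigma}_{\text{Depth}} \otimes \boldsymbol{\Sigma}_{\text{Time}},
\qquad
\operatorname{cov}\!\left(y_{dt},\, y_{d't'}\right)
  = \sigma_{dd'}\,\tau_{tt'}
```

so the covariance between any two observations factors into a Depth part and a Time part.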

Steve Denham

GBR2015
Calcite | Level 5

Thank you Steve for the above answer, the SAS code worked well. 


     However, some of the data do not satisfy the normality assumption, and using PROC GLIMMIX I get a much lower AIC. Considering the PROC MIXED code above, is it correct to use the PROC GLIMMIX code below to run the doubly repeated measures analysis?


Proc glimmix data=data;

           CLASS Block WholePlot SplitPlot Time Depth;

           MODEL y= WholePlot | SplitPlot | Time | Depth / ddfm=kr dist=;

           RANDOM intercept WholePlot / subject= Block;

           RANDOM Time / type=un subject=block;

           RANDOM Depth / type=un subject=block;

Thank you in advance for your attention

SteveDenham
Jade | Level 19

Hmm.  That could work, although I would make the latter two R-side RANDOM statements (add the RESIDUAL option).

I am going to go into uncharted territory here.  SP(POWA) models an anisotropic power covariance structure in k dimensions; in this case, k=2 (TIME and DEPTH).  I would rescale time and depth so that the ranges of values covered are similar (divide all values by the maximum observed) and then try:
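If I read the documentation's formulation correctly, SP(POWA) with a two-variable c-list fits, for residuals i and j (my notation):

```latex
% Anisotropic power structure in two coordinates (time, depth),
% each with its own correlation parameter:
\operatorname{cov}(e_i, e_j)
  = \sigma^2\,
    \rho_{\text{time}}^{\,\lvert t_i - t_j \rvert}\,
    \rho_{\text{depth}}^{\,\lvert d_i - d_j \rvert}
```

which is why the two coordinates should be on comparable scales: each correlation parameter is raised to the distance in its own coordinate.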

Proc glimmix data=data;

           CLASS Block WholePlot SplitPlot Time Depth;

           MODEL y= WholePlot | SplitPlot | Time | Depth / ddfm=kr dist=;

           RANDOM intercept WholePlot / subject= Block;

           RANDOM _residual_ / type=sp(powa)(scaletime scaledepth) subject=block;

What do you guys think about this approach?

Steve Denham


GBR2015
Calcite | Level 5

Steve, I'm very grateful for your attention


I tried to rescale my data as you said, but I do not know if it is correct. Could you guide me? My original Time factor levels range from 1 to 3 and my Depth factor levels from 1 to 5.


How should I correctly rescale them?


Thank you.

sld
Rhodochrosite | Level 12

Steve, this is an intriguing idea. And it would address unequal time and depth intervals. Being an "empirical" programmer, I'd have to try it out on data to have anything of substance to contribute. But speculating wildly, I'd consider using

random intercept A / subject=Block;

random Depth Time / type=sp(powa)(scaletime scaledepth) subject=Block*A*B residual;


(I've switched notation, described in another response to GBR2015, but somehow I submitted it as a comment and it is "currently being moderated". Hopefully it will show up eventually.)

Or would you get the same thing with your


random _residual_ / type=sp(powa)(scaletime scaledepth) subject=Block;


Seems like this model implies a design in which depth and time levels have the same experimental unit. That's probably not entirely correct, but it might well be good enough and it cuts down a bit on the number of parameters. GBR2015 seems to be getting successful runs, so maybe his/her data can support lots of estimation. I'm envious, mine usually doesn't.


Susan

GBR2015
Calcite | Level 5

Hi Susan and Steve,

First, Susan, what an astonishing design exposition. I understand your plot naming considerations, and we need to take those points into account.


Yes, the levels of my depth factor are soil core samples taken within each B-level plot, so they are destructive samples.


The subject I have used for the random factors is ‘subject=Block*A*B’, and not only Block.

I really need to try out all options using Proc Glimmix because of the data distribution type. So, I have run your suggested code as below:

#1 random Depth Time / type=sp(powa)(Depth Time) subject=Block*A*B residual;  (But I got this ERROR: Only one residual effect is permitted per RANDOM statement.)

So, I removed the term “residual” from the end of code #1 (I don’t know if it is correct to do so), and then it runs fine; the results differ from those using Steve's suggestion (#2 below).

#2 random _residual_ / type=sp(powa)(Depth Time) subject= Block*A*B;

Using suggestion #1 (considering the default distribution), results are similar to those obtained from the Proc Mixed code Steve suggested first.


I'm still not convinced that the factor level scale I use is correct.


Thank you,

sld
Rhodochrosite | Level 12

Various thoughts:

1. Just to be sure: are you assessing the distributional assumptions using the residuals rather than looking at the response values prior to fitting a model?

2. When you are using the spatial covariance structures that have a c-list, the variables in the c-list need to be on a continuous scale. You can copy Depth and Time to, say, xDepth and xTime; put Depth and Time in the CLASS statement; and use xDepth and xTime in the RANDOM statement:

random Time Depth / type=sp(powa)(xTime xDepth ) subject=Block*A*B residual;

To be safe, sort the input data to match the variable order in the RANDOM statement. I would think you would need to include "residual" but I'm not totally sure; there could still be something not specified by the model that is residual, notably Block*A*B*Time*Depth. And I don't know whether xDepth and xTime will resolve the error. You can let us know! I also don't know whether xDepth and xTime need to be scaled. Maybe Steve can elaborate.

3. Theoretically, you might assume autocorrelation associated with depth and time. But not infrequently with real data I've worked with, there's been no evidence of such: the compound symmetry or even the independence models fit as well or better than anything fancier. So if you are still experiencing computational problems and haven't tried this already, you might run a really simple model (well, relatively simple), essentially a blocked split-split-split model:

proc glimmix data=dataset;

               class A B Depth Time Block;

               model Y = A | B | Time | Depth / dist=normal;

               random intercept A / subject=Block;

               random B / subject=Block*A;

               random Time / subject=Block*A*B;

               run;

You could experiment with different distributions as needed without the additional complications of fancier covariance structures.

HTH!

Susan

Message was edited by: Susan Durham

lvm
Rhodochrosite | Level 12

This is a big and complex topic (doubly repeated measures). There are so many ways of tackling this, including with the approaches already listed. It is an area of active work by me, but I don't have any simple suggestions at this time. It would take too much writing.

GBR2015
Calcite | Level 5

Hi lvm

Is there a right way to know which model is best to use, for example the AIC and BIC values? And could you explain to me how to rescale the two random variables so that the range of values covered is similar, as suggested before, when using type=sp(powa)?

Thank you

SteveDenham
Jade | Level 19

One easy scaling method is to simply divide the values by the maximum value.  Suppose depth is 1 to 5; then you would have continuous values 0.2, 0.4, 0.6, 0.8 and 1.  Time is 1 to 3, so you would have 0.334, 0.667 and 1 (note the odd rounding for the first value, so that values are equally spaced).

An alternative is to scale both to the least common multiple.  For 3 and 5, this is 15, so for depth you would have 3, 6, 9, 12 and 15, and for time you would have 5, 10 and 15.  So long as the least common multiple is not some large integer, this has the advantage of equal spacing of levels without rounding.
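Both scalings can be sketched numerically (Python is used here only for the arithmetic; the variable names are mine, not from the thread):

```python
from math import lcm

depth_levels = [1, 2, 3, 4, 5]
time_levels = [1, 2, 3]

# Method 1: divide each value by the maximum observed value
scaledepth = [d / max(depth_levels) for d in depth_levels]
scaletime = [round(t / max(time_levels), 3) for t in time_levels]
# scaledepth -> 0.2, 0.4, 0.6, 0.8, 1.0
# scaletime  -> 0.333, 0.667, 1.0 (Steve rounds the first value up to
#               0.334 so the three values stay exactly equally spaced)

# Method 2: rescale both factors to their least common multiple
m = lcm(max(depth_levels), max(time_levels))  # lcm(5, 3) = 15
depth_lcm = [d * m // max(depth_levels) for d in depth_levels]  # 3, 6, ..., 15
time_lcm = [t * m // max(time_levels) for t in time_levels]     # 5, 10, 15
```

In SAS these would simply be two new data-step variables (e.g. scaletime = Time/3;) supplied in the c-list of the RANDOM statement.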

Steve Denham

GBR2015
Calcite | Level 5

Steve,

Thank you for the rescaling explanation. I ran many Glimmix models and I think I will use your suggestion (below); it gives me the lowest AIC value.

RANDOM _residual_ / type=sp(powa)(scaletime scaledepth) subject=block*A*B;


Thanks

sld
Rhodochrosite | Level 12

I’ll start with some design clarifications. It’s extremely helpful to distinguish between design *units* (which are random effects) and design *factors* (which are fixed effects). The term “WholePlot” cannot be used in both the MODEL statement and the RANDOM statement: the WholePlotFactor (call it A) as a categorical fixed effect (at least, I’m assuming it is categorical; correct me if that’s wrong) will have some number of levels a. The WholePlotUnit will have many more levels: number of replications x number of levels of A = r x a. Similarly, the SubPlotUnit is not the same as the SubPlotFactor (call it B, with b levels).

Model Design 1: Let’s distinguish between Depth and the unit associated with levels of Depth (call it StripDSubPlotUnit, which is nested within SubPlotUnit). If you returned to the same StripDSubPlotUnit for 3 times, then you can define an additional design unit StripTSubPlotUnit (also nested within SubPlotUnit) that is associated with levels of Time. The StripTSubPlotUnits are crossed with (rather than nested within) the StripDSubPlotUnits, which is characteristic of strip-plot design. The simple model that assumes no autocorrelation in time or depth is

proc mixed data=dataset;

               class A B Depth Time Block;

               model Y = A | B | Depth | Time / ddfm=kr;

               random intercept A / subject=Block;

               random Depth / subject=Block*A*B;

               random Time / subject=Block*A*B;

               run;

Note that “Block*A*B” identifies SubPlotUnit.

To invoke some form of autocorrelation, consider a model like

proc glimmix data=dataset;

               class A B Depth Time Block;

               model Y = A | B | Depth | Time / ddfm=kr2;

               random intercept A / subject=Block;

               random Depth / subject=Block*A*B type=ar(1);

               random Time / subject=Block*A*B type=ar(1);

               run;

Keep in mind that AR(1) assumes that depth and time intervals are equal. You could try different covariance structures like UN (or CHOL) but keep in mind the number of parameters to be estimated.
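For comparison, AR(1) uses a single correlation parameter raised to the lag between (equally spaced) levels, in my notation:

```latex
% First-order autoregressive structure over equally spaced levels i, j:
\operatorname{cov}(e_i, e_j) = \sigma^2 \rho^{\lvert i - j \rvert}
```

so unequal depth or time intervals would distort the implied correlations, which is why the equal-spacing assumption matters here.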

Steve’s suggestion of the Kronecker product in the MIXED procedure could be tried, too. My guess at code would be

proc mixed data=dataset;

               class A B Depth Time Block;

               model Y = A | B | Depth | Time / ddfm=kr;

               random intercept A / subject=Block;

               repeated Depth Time / subject=Block*A*B type=un@un;

               run;

The big disadvantage to this model is the number of covariance parameter estimates (21, I think: 15 for the UN structure for Depth, 6 for the UN structure for Time not counting those for Block and Block*A). One would hope that a simpler covariance structure would provide an adequate fit.
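The parameter counts quoted above follow from the n(n+1)/2 formula for an n x n unstructured covariance matrix; a quick check (Python used only for the arithmetic):

```python
def un_params(n: int) -> int:
    """Free (co)variance parameters in an n-by-n unstructured matrix."""
    return n * (n + 1) // 2

depth_params = un_params(5)         # UN over 5 depths -> 15
time_params = un_params(3)          # UN over 3 times  -> 6
total = depth_params + time_params  # 21, as in the post above
```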

A possible simplification in the covariance structure would be to regress on depth and/or time, using random coefficient models. This approach requires you to pick an appropriate model form (linear? quadratic? something else?) and it does make the fixed effects part of the model potentially more challenging.

Model Design 2: Alternatively, what if you were taking destructive samples, like soil cores, so that you measured different units associated with levels of Depth at each level of Time. Now you have design units associated with levels of Time (call them SubSubPlots, nested within SubPlot), and then you have SubSubSubPlot units associated with levels of Depth nested within SubSubPlot.  You might assume that the SubSubPlots are independent within a SubPlot. You might assume that there is autocorrelation among the SubSubSubPlot units. So maybe you would consider

proc mixed data=dataset;

               class A B Depth Time Block;

               model Y = A | B | Time | Depth / ddfm=kr;

               random intercept A / subject=Block;

               random B / subject=Block*A;

               random Time / subject=Block*A*B; /*omit this line for AR(1), retain for AR(1)+RE */

  repeated Depth / subject=Block*A*B type=ar(1);

               run;

The Kronecker product might be valid here, maybe

proc mixed data=dataset;

               class A B Depth Time Block;

               model Y = A | B | Time | Depth / ddfm=kr;

               random intercept A / subject=Block;

               repeated Time Depth / subject=Block*A*B type=un@un;

               run;

You could try a modestly simpler covariance structure like un@ar(1).

These models, some more than others, are attempting to estimate a lot of variance/covariance parameters. If you don’t have a lot of Blocks, you will have problems with estimation (like parameters set to zero, or failure to converge). Just today, one of the authors posted this link http://arxiv.org/abs/1506.04967 to a paper by Bates et al. on selecting parsimonious mixed models. I’ve only skimmed it, but it looks intriguing.

This article http://www2.sas.com/proceedings/sugi29/188-29.pdf by Barry Moser illustrates a nice progression through a series of model alternatives that I think GBR2015 would find useful.

I can’t guarantee that I’ve got the syntax right for the code above, but hopefully close enough.

Susan


