BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
michel
Calcite | Level 5

Hi!

I have a complex repeated-measures model with multiple random effects, and I want to check to make sure I'm coding the random effects correctly.  Specifically, I'm concerned about the number of subjects shown in the Dimensions panel.

Background

Briefly, I performed two treatments (TREAT1 and TREAT2) on a sample of 60 plants (PLOT) in a split-plot design.  These treatments were kept in place over a period of 10 months (TIME), and the response variable (NUMARTHS, density of arthropods per sq. m leaf area (0-200+)) was resampled on each plant 3 times post-treatment (month 1, 4, 10).  I include the response value from day 0 (pre-treatment) as a covariate (NA0) to limit the effects of the treatments to the post-treatment values.  My fixed effects are TREAT1 and TREAT2. My repeated measure is TIME.  The fixed effects, their interaction, and their effects over time are what I am actually interested in.

I also have 2 random effects: plant species (PLANT; I used 5 spp., unequally distributed across treatments) and treatment blocks (BLOCK; N=5).  I am mostly interested in controlling for the effects of PLANT and BLOCK when analyzing to determine the strength of the fixed effects, though I do want to know the covariance parameter(s).  I want to model the random effects with both intercepts and slopes.  For example, I expect that some plant species will naturally have more or less arthropods than others (=intercept*), and that the treatment effects over time will be stronger (more +/-) on some plant species than others (=slope*).

* Question 1: Is this a correct interpretation of the role of random intercepts / slopes?

Code:

proc glimmix data=vaTOTRM_RP plots=(all) method=RSPL IC=PQ noreml plots=residualpanel order=data;

Class PLOT TREAT1 TREAT2 PLANT BLOCK ;

model NUMARTHS = TREAT1 TREAT2 TREAT1*TREAT2 TIME TREAT1*TIME TREAT2*TIME TREAT1*TREAT2*TIME na0 /  dist=poisson link=log ddfm=betwithin solution;

random TIME / residual subject=PLOT type=ante(1);

random intercept TIME / solution subject = PLANT;

random intercept TIME / solution subject = BLOCK;

output out=naapouth pred(blup)=mu resid=r;

nloptions maxiter=500 tech=newrap;

run;

Question 2: am I correctly modeling both random intercepts and slopes here?

Question 3: When I run this model without the random effects, the dimensions panel correctly shows 60 subjects with 3 observations each.  However, when I include these random effects, the dimensions panel shows 1 subject with 179 observations (there's one missing value).  I assume this is because I use different subjects (plant/block) in the random effect code?  Is this a problem, and if so how can I get around it?  I do not want to model separate random intercepts or slopes for every individual plot.

Thanks in advance for any advice you can offer!

~Nicole

1 ACCEPTED SOLUTION

Accepted Solutions
SteveDenham
Jade | Level 19

I get confused by the subject number as well, but I think we have enough data now to say that if the subjects are not nested/crossed with each other, GLIMMIX defaults to reporting them as 1.  We could throw this around some more by trying the following G-side parameterization, and using method=quad:

proc glimmix data=vaTOTRM_RP plots=(all) method=QUAD IC=PQ noreml plots=residualpanel order=data;

Class PLOT TREAT1 TREAT2 PLANT BLOCK ;

model NUMARTHS = TREAT1 TREAT2 TREAT1*TREAT2 TIME TREAT1*TIME TREAT2*TIME TREAT1*TREAT2*TIME na0 /  dist=poisson link=log ddfm=betwithin solution;

random TIME /subject=PLOT type=ante(1);

random intercept TIME / solution subject = PLANT;

random  BLOCK; /* Not specified as a subject based effect to try to get the subject number to 60 */

output out=naapouth pred(blup)=mu resid=r;

nloptions maxiter=500 tech=newrap;

run;

If this doesn't run, it should at least put something interesting in the log regarding the subject dissonance.

Now on to the other stuff:  I am not quite sure what you mean by "test" when it comes to random effects.  You could use the COVTEST statement, but somehow I don't feel that is what you are looking for.  I think you are going to have to use ESTIMATE statements to get what you want, if what you want is to test one slope against another, or one blup against another.

Now as for the bolded random statements:

random plant block / subject=plot;

This would estimate 2 variance components.  The first would be the variance due to plants represented in each plot, and the second would be due to the blocks within each plot. Subject (Blocks in V) would be 60.

random intercept months / subject=plot;

random intercept months / subject = plot*plant;

random intercept months / subject = plot*plant*block;

These represent random intercepts and slopes for various observational units.  I think Subjects (Blocks in V) would range from 60 for the first to as many combined plot/plank/block units as you have for the last, and as you mentioned, they need to be ordered in this fashion.  Only by looking at the solutions on these would it make sense to me, so I would ask for the solutions.

From there, comparisons using ESTIMATE statements would be the next step, I think.

Steve Denham

View solution in original post

6 REPLIES 6
SteveDenham
Jade | Level 19

What is the interpretation of the block by time slope?  My initial inclination is to drop the TIME variable from that random statement.

I might also consider the following:

random intercept TIME BLOCK/solution subject=PLANT ;

In this formulation, PLANT*BLOCK would be the subject level for your blup's.  Is that interpretable?  I am sure you would rather have marginal means for the slopes and intercepts, averaged over the blocks, so maybe this isn't the way to go.

I drop back to the question "How important is the BLOCK*TIME slope in interpretation and design?"  If you could get away from that, then this does become more tractable.

Steve Denham

michel
Calcite | Level 5

Hi Steve,

Thanks for your helpful reply!  Yes, I could drop the TIME*BLOCK slope.  The TIME interaction is really more important for PLANT than BLOCK.  My plant species are not distributed equally across blocks or treatments, so I don't want to use PLANT*BLOCK as my subject levels.

So this brings me back to Question #3: is it ok that the Dimensions output shows 1 subject with 179 observations, rather than 60 subjects with (2-)3 observations each?

SteveDenham
Jade | Level 19

No, it really isn't right.  What happens if you try:

proc glimmix data=vaTOTRM_RP plots=(all) method=RSPL IC=PQ noreml plots=residualpanel order=data;

Class PLOT TREAT1 TREAT2 PLANT BLOCK ;

model NUMARTHS = TREAT1 TREAT2 TREAT1*TREAT2 TIME TREAT1*TIME TREAT2*TIME TREAT1*TREAT2*TIME na0 /  dist=poisson link=log ddfm=betwithin solution;

random TIME / residual subject=PLOT type=ante(1);

random intercept TIME / solution subject = PLANT;

random  BLOCK; /* Not specified as a subject based effect to try to get the subject number to 60 */

output out=naapouth pred(blup)=mu resid=r;

nloptions maxiter=500 tech=newrap;

run;

I don't know if this will give what you need or not.  I just ran some dummy data, and it looks like the subject number when both R and G side parameters are specified is set equal to one.

On another subject: since in a Poisson distribution, the mean and variance are set equal to one another and there is a log link, it may be a good idea to try to code the repeated measures as G side, rather than R side (as a generalized linear mixed model rather than as a generalized estimating equation, see Stroup (2013) Generalized Linear Mixed Models).  Then the subject numbers would be correct.

Steve Denham

michel
Calcite | Level 5

Hi Steve,

Thanks again.  I've tried running the repeated measure as a G-side effect - this is done by removing the "residual" statement, correct?

I've also tried multiple variations of the G-side random effects, including:

- using random statements without subjects (e.g., "random BLOCK" or "random PLANT")

- putting G-side random statements with "subject='plant'" or without subject specified prior to the R-side random statement

- adding a G-side random statement with "subject='plot'", and putting it first (because I read somewhere that, if the number of subjects in the Dimensions panel = 1, the model will use the number of subjects in the first G-side random statement)

In every case, my number of subjects goes back to 1.  The only way that I can maintain the correct number of subjects is to include "subject='plot'" in every single G-side random statement.

For example, this statement produces the correct number of subjects:

random plant block / subject=plot;

But, if I understand random effects correctly, that's testing the random effect of plot across plant species and block.  Which shouldn't even be estimable because each plot includes a plant of just 1 species, in just 1 block.  Am I misunderstanding this statement?

I can also get this series of statements to produce the correct number of subjects:

random intercept months / subject=plot;

random intercept months / subject = plot*plant;

random intercept months / subject = plot*plant*block;

But again, I don't think this is testing the random effect I want to test for, namely: that PLOTs with PLANT of species X have higher/lower numbers of arthropods than PLOTs with PLANT of species Y (or Z, A, B, or C), and respond differently to TREAT1 & TREAT2 over time.

Is there any way to test the above random statement in this model?  Thanks again for your help!

SteveDenham
Jade | Level 19

I get confused by the subject number as well, but I think we have enough data now to say that if the subjects are not nested/crossed with each other, GLIMMIX defaults to reporting them as 1.  We could throw this around some more by trying the following G-side parameterization, and using method=quad:

proc glimmix data=vaTOTRM_RP plots=(all) method=QUAD IC=PQ noreml plots=residualpanel order=data;

Class PLOT TREAT1 TREAT2 PLANT BLOCK ;

model NUMARTHS = TREAT1 TREAT2 TREAT1*TREAT2 TIME TREAT1*TIME TREAT2*TIME TREAT1*TREAT2*TIME na0 /  dist=poisson link=log ddfm=betwithin solution;

random TIME /subject=PLOT type=ante(1);

random intercept TIME / solution subject = PLANT;

random  BLOCK; /* Not specified as a subject based effect to try to get the subject number to 60 */

output out=naapouth pred(blup)=mu resid=r;

nloptions maxiter=500 tech=newrap;

run;

If this doesn't run, it should at least put something interesting in the log regarding the subject dissonance.

Now on to the other stuff:  I am not quite sure what you mean by "test" when it comes to random effects.  You could use the COVTEST statement, but somehow I don't feel that is what you are looking for.  I think you are going to have to use ESTIMATE statements to get what you want, if what you want is to test one slope against another, or one blup against another.

Now as for the bolded random statements:

random plant block / subject=plot;

This would estimate 2 variance components.  The first would be the variance due to plants represented in each plot, and the second would be due to the blocks within each plot. Subject (Blocks in V) would be 60.

random intercept months / subject=plot;

random intercept months / subject = plot*plant;

random intercept months / subject = plot*plant*block;

These represent random intercepts and slopes for various observational units.  I think Subjects (Blocks in V) would range from 60 for the first to as many combined plot/plank/block units as you have for the last, and as you mentioned, they need to be ordered in this fashion.  Only by looking at the solutions on these would it make sense to me, so I would ask for the solutions.

From there, comparisons using ESTIMATE statements would be the next step, I think.

Steve Denham

michel
Calcite | Level 5

Hi Steve,

Great, I think I have it working now.  Thanks so much for your help, I really appreciate it!

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 1504 views
  • 3 likes
  • 2 in conversation