Re: 2 repeated variables in Proc Mixed

KellyLRonald · Posted 05-25-2018 02:20 PM

Hello--I have an experimental design where animals are exposed to different trials (first repeated factor, i.e., 'trial_type'), and within each trial there is a before and an after measurement (second repeated factor, i.e., 'lights'). I'm wondering what the most appropriate way to code this is, and decided to try using two random statements. Is it also possible to use two repeated statements? Originally I was just using the repeated statement that I have crossed out here, before I realized that my design needed to include the before/after component of the variable called "lights".

proc sort data=New3;
  by Male Trial_Day;
run;

proc mixed data=New3;
  class Male Trial_Type Lights;
  model LogUSVS = Trial_Type Lights Lights*Trial_Type
        / solution outp=preds ddfm=kr;
  lsmeans Trial_Type Lights*Trial_Type Lights / diff;
  random Male Trial_Type(Male) Lights(Male) / V Vcorr;
run;

/* repeated / type=ar(1) subject=Male; title 'USV'; run; */

sld · Posted 05-26-2018 11:08 PM

I'm assuming here that you have animals (identified by "Male"). Each animal is exposed to some number (2?) of different "Trial_Type"s. And within each Trial_Type, there is one measure for each of two "Lights" levels. LogUSVS is assumed to be normally distributed, conditional on the fixed and random effects in the model.

If these assumptions are correct, then I would consider this code as a starting point. I'm assuming that there are only 2 levels of Lights and only 2 levels of Trial_Types. I'm using GLIMMIX rather than MIXED because GLIMMIX does nearly everything that MIXED does, and GLIMMIX offers many more useful bells and whistles.

proc glimmix data=have
  plots=(studentpanel boxplot(fixed student);
  class male trial_type lights;
  model LogUSVS = trial_type | lights / ddfm=kr2;
  random male
         male*trial_type;
  lsmeans trial_type lights;
  lsmeans trial_type*lights / plot=meanplot(join cl sliceby=lights);
  output out=glmm_out;
  run;

When you move into mixed models, you are rapidly wading into the deep end of the pool, and you will want to know more than you currently know. An excellent resource that focuses on Proc MIXED (not so much on GLIMMIX because the book is rather old, but still extremely useful) is

SAS® for Mixed Models, Second Edition

For a GLIMMIX perspective, see

Generalized Linear Mixed Models: Modern Concepts, Methods and Applications

particularly Chapters 2, 7, and 8.

I hope this helps.

KellyLRonald · Posted 05-27-2018 02:32 PM

Hello,

Thank you for the response! I will certainly check out the recommended book. I was wondering if my original code is incorrect? Can you explain why? I was using the code I found in this article online:

Paper 188-29 Repeated Measures Modeling With PROC MIXED E. Barry Moser, Louisiana State University, Baton Rouge, LA

I attempted to run the Proc GLIMMIX code you suggested, but the log is telling me I'm missing a parenthesis. I attempted to add one in the plots statement, but it didn't work. Then I attempted to remove this line completely and it also didn't run. I'm wondering why you did not include "lights" within the random statement here?

The assumptions you made are correct, except that I have more than 2 levels of "trial_type" (there are actually 5). Each male was exposed to each trial type twice, and within each trial type there are two measurements taken, one before and one after (the variable "lights"), so that there are two levels of lights.

Thank you so much.

sld · Posted 05-27-2018 08:00 PM

Oh, yes, that is an excellent paper. SO much great information, and definitely not outdated (even though the paper is from 2004). Can you imagine trying to take all that in during a typical conference presentation?!

Taking the easy question first: the second line of code needs a closing parenthesis:

  plots=(studentpanel boxplot(fixed student));

Whether you include MALE*LIGHTS in the RANDOM statement depends largely upon the experimental design. If the design is a strip-plot, then the term is often included; if the design is a split-plot, the term is often omitted.

But first: You say TRIAL_TYPE has 5 levels with two implementations of each. Perhaps there are 10 trials, for the 10 combinations of TRIAL_TYPE and LIGHTS? I see TRIAL_DAY in the SORT procedure, and perhaps that indexes the 10 trials? If so, how are the 10 TRIAL_TYPE x LIGHTS combinations assigned to the 10 TRIAL_DAYs? Do you think response in later trials is affected by which treatments occurred prior (i.e., do you think there is any carryover, or learning)? The (random?) assignment, and any constraints on that (random?) assignment, would inform how you would model possible variance and autocorrelation among levels of TRIAL_TYPE and/or LIGHTS.

When you use the REPEATED statement in MIXED, in many cases it's safer to specify the <repeated-effect> than to omit it; see the documentation. Your code might work correctly given that you pre-sort by TRIAL_DAY within MALE, but that might give you 10 repeated measurements for each level of MALE, and that's not compatible with including MALE*TRIAL_TYPE or MALE*LIGHTS in the RANDOM statement.

We would need more detail about the experimental design to move forward.

KellyLRonald · Posted 05-29-2018 11:40 AM

Thank you again for your help; I've found the book and the Proc GLIMMIX code is now working. I appreciate it!

I should say that the "lights" variable is much like a pre/post test: All animals were measured before and after the lights were turned on within an experiment where one trial_type (of the 5) was administered. All animals were exposed to each of the trial types and this entire experiment was duplicated. The research question I'm interested in asking is whether the change in outcome from pre to post differs between treatment groups (thus the interaction between lights*trial_type is what I am looking at). I believe this is more similar to a split-block design, but this is my first time using these terms so you can feel free to correct me.

Thus, my file has 20 lines for each animal: 5 trial types * 2 lights (before and after measurements) * 2 (duplicates). The sort procedure was used to sort my data by trial_day, as I think the animals may grow habituated to the experiment (one trial_type was run per day), and I do this sorting to make sure the repeated statement with type=ar(1) runs correctly. So yes, I do think responses in later trials might be affected by which treatments occurred; to help alleviate this effect the trial_types were randomly assigned with 1 replacement over the 10 trial days. I also include trial_day as a covariate in my model (although it's not shown here in this code).

Do you think this code still needs a repeated statement? I thought (according to the paper I linked to earlier) that using the random statements [male, lights(male) trial_type(male)] replaced the need for using a repeated statement?

sld · Posted 05-29-2018 06:30 PM

Each animal had 10 trials. A measurement was made at the beginning (pre-treatment) and end (post-treatment) for each trial, producing 20 values for each animal.

Did you run the full set of 5 treatments in the first 5 trials, and then a second set of treatments in the second 5 trials? If so, was treatment order the same for both periods for a given animal, or did each period have a different order of treatments? Or were 2 replicates of each of 5 treatments randomly assigned to 10 trials without any constraint?

However treatment order was imposed, did it vary among animals?

How many animals do you have?

What do you mean by "trial_types were randomly assigned with 1 replacement over the 10 trial days"?

I would say that this is a form of crossover design, with 5 treatments in 10 periods and no a priori planned sequences. There is a lot of literature on crossover designs (and Latin squares, which are a special case). This link is a good quick introduction:

https://newonlinecourses.science.psu.edu/stat509/node/123/

In particular, note that a disadvantage of crossover designs is carryover, which is a lingering effect of a previous treatment on the response to the current treatment. (Carryover is not the same thing as autocorrelated response; carryover is a treatment-specific effect.) You have said that you think carryover is a possibility. One solution to the problem of carryover is to use an experimental design in which treatment and carryover effects can be separately estimated (i.e., they are not confounded, or aliased). For an example, see Proc PLAN Example 65.7 Crossover Designs. Your design is already established with treatments randomly assigned to periods in some fashion, and it would be lucky indeed if it was efficient with respect to carryover.

The REPEATED statement in MIXED (or RANDOM / RESIDUAL in GLIMMIX) specifies the structure of the R covariance matrix; RANDOM statements specify the structure of the G covariance matrix. Whether you need a REPEATED statement or one or more RANDOM statements depends on how the experimental design is implemented in the statistical model and reflects your hypotheses about covariance/correlation among observations. I'm not clear on an appropriate statistical model yet, so I cannot say whether REPEATED is needed.
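For reference, the roles of G and R can be written out with the standard linear mixed model. This is the generic textbook form, not a model fitted to these data; the symbols are the usual ones and are not taken from the code in this thread:

```latex
\begin{aligned}
\mathbf{y} &= \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}\boldsymbol{\gamma} + \boldsymbol{\varepsilon},
\qquad \operatorname{Var}(\boldsymbol{\gamma}) = \mathbf{G},
\qquad \operatorname{Var}(\boldsymbol{\varepsilon}) = \mathbf{R}, \\
\operatorname{Var}(\mathbf{y}) &= \mathbf{V} = \mathbf{Z}\mathbf{G}\mathbf{Z}' + \mathbf{R}.
\end{aligned}
```

RANDOM effects enter through Z and G, while REPEATED (or RANDOM / RESIDUAL) shapes R directly, so both routes can induce correlation among observations on the same animal.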

One way to simplify the model is to eliminate the LIGHTS factor by combining the pre and post measures into a single response, for example (post-pre), or (post/pre) or [log(post/pre)]. The latter is algebraically equivalent to [log(post) - log(pre)] which is what an ANOVA with LIGHTS and a log-transformed response is doing behind the scenes for the LIGHTS effect.
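Written out, and assuming LogUSVS is the log of the raw response (as its name suggests), the identity behind that last option is:

```latex
\log\!\left(\frac{\text{post}}{\text{pre}}\right)
  = \log(\text{post}) - \log(\text{pre})
  = \text{LogUSVS}_{\text{after}} - \text{LogUSVS}_{\text{before}}
```

So with a response that is already log-transformed, the combined measure for each trial is simply the after value minus the before value.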

KellyLRonald · Posted 05-30-2018 07:10 AM

Yes, each animal had 10 trials, with a pre- and a post-treatment measurement for each trial, producing 20 values for each animal. That is correct.

The two replicates of the 5 treatments were randomly assigned to the ten trials without any constraint. This is what I mean when I say "trial_types were randomly assigned over the trials with one replacement".

Treatment order also varied among animals and was also randomly assigned.

There were 9 animals in total.

I agree that this is a cross-over design and would like to explore the possibility of a carry-over effect...moving forward into more experiments I'm thankful for this resource!

sld · Posted 05-30-2018 04:13 PM

I'm moving into speculative mode here....

With only 9 subjects and random (i.e., unplanned) treatment sequences, I doubt that you will be able to address carryover in this study. But you certainly could plan for carryover in future experiments, while keeping in mind one of my favorite quotes from SAS-L by Ronald Crosier: "No one should ever do an experiment without analyzing it first." Crossover designs are complex, and of course we are always dealing with logistical constraints (like number of subjects). I would seek help from someone at your institution with experimental design expertise (someone on the Stat or Biostat or medical school faculty who teaches experimental design, for example).

I can think of many different models, none of which is perfect and none of which I can vouch for. Here's one example that might address something vaguely like "pseudo-carryover" (as distinct from carryover as defined for crossover designs), based on comparing the effect of a given treatment applied the first time to its effect applied the second time. The problem of course is that there are multiple possible explanations if the order effect is significant (true carryover and/or learning and/or acclimation, etc.). (Did I say I won't vouch for this model? Yes, I did. Caveat emptor.)

proc glimmix data=new;
  class trial_day trial_type order;
  model log_ratio = trial_type order trial_type*order / ddfm=kr2;
  random trial_day / subject=male type=<whatever> residual;
  lsmeans trial_type / pdiff adjust=<something>;
  lsmeans order;
  lsmeans trial_type*order / plot=meanplot(join cl sliceby=trial_type);
  run;

where the new dataset has 90 observations (9 males x 10 trial_days), log_ratio computed as log(post/pre) for each trial_day within each male; and order with two levels (first application of treatment, second application of treatment). <whatever> might be cs or ar(1), etc. <something> might be tukey or simulate, etc. The response log_ratio invokes a "change from baseline" approach and allows a simpler model (we get rid of lights as a factor); there are multiple ways to quantify change in the literature and log_ratio might not be your best choice. Note that trial_type x order has 10 combinations, and so it is confounded with trial_day; the model uses trial_day in the RANDOM statement to index the repeated measures in time but not as a fixed effects factor that determines the mean of the response; in this sense, this model is not like a typical crossover model because it does not have "period" explicitly as a fixed effect (which does serve to highlight that a period effect is confounded with effects due to trial_type and/or order effects).
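One possible way to build that 90-row dataset from the original 20-rows-per-animal file is sketched below. It is only a sketch under stated assumptions: the input is assumed to be New3 with one row per Male x Trial_Day x Lights, Lights is assumed to be coded 'before' / 'after' (hypothetical values; adjust to the real coding), and the intermediate dataset names sorted and wide are made up for illustration.

```sas
/* Assumption: New3 has Male, Trial_Day, Trial_Type, Lights, LogUSVS,
   with Lights coded 'before' / 'after' (hypothetical coding). */
proc sort data=New3 out=sorted;
  by Male Trial_Day Trial_Type;
run;

/* One row per Male x Trial_Day, with pre and post values side by side
   as log_before and log_after */
proc transpose data=sorted out=wide(drop=_name_) prefix=log_;
  by Male Trial_Day Trial_Type;
  id Lights;
  var LogUSVS;
run;

/* Within each male, order the two applications of each treatment by day */
proc sort data=wide;
  by Male Trial_Type Trial_Day;
run;

data new;
  set wide;
  by Male Trial_Type;
  /* LogUSVS is already logged, so the difference equals log(post/pre) */
  log_ratio = log_after - log_before;
  /* order = 1 for the first application of a treatment, 2 for the second */
  if first.Trial_Type then order = 1;
  else order = 2;
run;

/* Restore the time ordering within each male */
proc sort data=new;
  by Male Trial_Day;
run;
```

If any before or after value is missing for a trial, log_ratio is missing for that trial and the observation drops out of the model.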

Good luck!
