PROC GLIMMIX code for a single factor repeated measures design with replicates is needed

Contributor
Posts: 57

Hi,

My design looks like this:

Subject | Condition1 | Condition2 | Condition3 | Condition4 | Condition5
1 | replicate1, replicate2, replicate3, ..., replicate15 | replicate1, replicate2, replicate3 | replicate1, replicate2, replicate3 | replicate1, ., . | replicate1, replicate2, replicate3
2 | replicate1, replicate2, replicate3, ..., replicate15 | replicate1, replicate2, . | replicate1, replicate2, replicate3 | replicate1, replicate2, replicate3 | replicate1, replicate2, replicate3
3 | replicate1, replicate2, replicate3, ..., replicate12 | replicate1, replicate2, replicate3 | replicate1, replicate2, replicate3 | replicate1, replicate2, . | replicate1, replicate2, replicate3
...
n

By "replicate" I mean what is mentioned here and here.

For most subjects/conditions I have 3 replicates; for some, only 2 or even 1 (because of outliers).

Each subject's parameter of interest was measured under 5 conditions (not time points).

For a similar design with only one "replicate", I was advised to use the following code:

PROC GLIMMIX DATA = ff_long_sorted ORDER = DATA MAXOPT = 500 PCONV = 1E-8;
  VALUEp = VALUEE/100;
  CLASS ExpID Condition;
  MODEL VALUEp = Condition / DISTRIBUTION = BINOMIAL DDFM = KENWARDROGER;
  RANDOM Condition / RESIDUAL SUBJECT = ExpID TYPE = CSH;
  *RANDOM _RESIDUAL_ / SUBJECT = ExpID TYPE = CSH;
  NLOPTIONS TECHNIQUE = NMSIMP MAXITER = 500;
  LSMEANS Condition / ADJDFE = ROW DIFF ILINK ADJUST = TUKEY CL
                      PLOTS = DIFFOGRAM(NOABS CENTER);
  ODS SELECT ConvergenceStatus FitStatistics Tests3 DiffPlot;
RUN;

But what about the situation when I have 1-3 replicates per subject/condition?

Thank you in advance.

Respected Advisor
Posts: 2,655

Here replicates provide an additional source of variability, and are a within-subject source.  Thus modify the code to:

PROC GLIMMIX DATA = ff_long_sorted ORDER = DATA MAXOPT = 500 PCONV = 1E-8;
  VALUEp = VALUEE/100;
  CLASS ExpID Condition Replicate;
  MODEL VALUEp = Condition / DISTRIBUTION = BINOMIAL DDFM = KENWARDROGER;
  RANDOM Condition / RESIDUAL SUBJECT = ExpID TYPE = CSH;
  RANDOM Replicate / SUBJECT = ExpID*Condition;
  NLOPTIONS TECHNIQUE = NMSIMP MAXITER = 500;
  LSMEANS Condition / ADJDFE = ROW DIFF ILINK ADJUST = TUKEY CL
                      PLOTS = DIFFOGRAM(NOABS CENTER);
  ODS SELECT ConvergenceStatus FitStatistics Tests3 DiffPlot;
RUN;
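The second RANDOM statement assumes the data are in long format, with one row per subject, condition and replicate, and a Replicate variable that numbers the replicates within each subject/condition cell. A minimal sketch of that layout follows; the data set name ff_long_example and all values are made up for illustration, and cells with fewer replicates simply contribute fewer rows.

```sas
/* Hypothetical long-format layout: one row per subject x condition x replicate. */
/* The data set name and all values are invented for illustration only.          */
DATA ff_long_example;
  INPUT ExpID Condition Replicate VALUEE;
  DATALINES;
1 1 1 42.0
1 1 2 40.5
1 1 3 43.1
1 2 1 55.2
1 2 2 53.8
2 1 1 38.9
2 1 2 39.4
2 1 3 41.0
2 2 1 57.6
;
RUN;

/* Sort so that ORDER = DATA gives a predictable level order in CLASS variables. */
PROC SORT DATA = ff_long_example;
  BY ExpID Condition Replicate;
RUN;
```

Here subject 1 has only two replicates in condition 2 and subject 2 only one, which mirrors the unbalanced case described in the question.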

You may want to switch to METHOD=LAPLACE to get conditional estimates rather than the marginal ones, which are known to be biased.  That code would look like:

PROC GLIMMIX DATA = ff_long_sorted ORDER = DATA METHOD = LAPLACE;
  VALUEp = VALUEE/100;
  CLASS ExpID Condition Replicate;
  MODEL VALUEp = Condition / DISTRIBUTION = BINOMIAL;
  RANDOM Condition / SUBJECT = ExpID TYPE = CSH;
  RANDOM Replicate / SUBJECT = ExpID*Condition;
  NLOPTIONS TECHNIQUE = NMSIMP MAXITER = 500;
  LSMEANS Condition / ADJDFE = ROW DIFF ILINK ADJUST = SIMULATE CL
                      PLOTS = DIFFOGRAM(NOABS CENTER);
  ODS SELECT ConvergenceStatus FitStatistics Tests3 DiffPlot;
RUN;

I also moved to a different adjustment (Edwards and Berry's simulation method as opposed to Tukey) as it provides better control of experiment-wise error rates.
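Because ADJUST=SIMULATE relies on random sampling, the adjusted p-values and limits can change slightly from run to run. A sketch of the same Laplace model with the simulation options spelled out might look like this; the SEED= and NSAMP= values are arbitrary illustrative choices, not recommendations from this thread.

```sas
/* Sketch only: the Laplace model from above with a reproducible simulation adjustment. */
PROC GLIMMIX DATA = ff_long_sorted ORDER = DATA METHOD = LAPLACE;
  VALUEp = VALUEE/100;
  CLASS ExpID Condition Replicate;
  MODEL VALUEp = Condition / DISTRIBUTION = BINOMIAL;
  RANDOM Condition / SUBJECT = ExpID TYPE = CSH;
  RANDOM Replicate / SUBJECT = ExpID*Condition;
  /* SEED= fixes the random number stream; NSAMP= sets the simulation sample size. */
  /* Both values below are assumptions chosen for illustration.                    */
  LSMEANS Condition / DIFF ILINK CL
                      ADJUST = SIMULATE(SEED = 12345 NSAMP = 20000);
RUN;
```

Fixing the seed makes the multiplicity-adjusted results repeatable, which is helpful when the output has to be reported.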

Steve Denham

Contributor
Posts: 57

Steve,

Following the idea of modelling repeated-measures data in REplicates, I've simulated log-normally distributed data (with known arithmetic mean and SD) [1] and tried to implement your suggestions. The data set and code are attached.

I changed the distribution to lognormal, as there is some evidence in the literature for that [2]. I (naively) guess that CSH is an adequate variance-covariance matrix type, as different conditions may cause different dispersion/variance. (Right?) The number of experiments (four) was chosen because we usually have 3-5 experiments.

I checked different optimization techniques in the NLOPTIONS statement. With the simulated data and the code I get strange output: negative values in the "Fit Statistics", strange numbers in the "Fit Statistics for Conditional Distribution", empty cells in the "Covariance Parameter Estimates", "0.0" values in "Pearson Chi-Square / DF", and large F-values. Sometimes the SAS System stopped processing because of errors, or the optimization could not be completed, etc.

To have a balanced data set I also tried using only TRIplicates in my dependent variable, but with no success. The same happened with monoplicates (entered into PROC GLIMMIX as the means of the REplicates).

How should I handle this kind of data set?

Sincerely,

Stan




-----------------

P.S.

References:

[1] thanks to 's post/replies at

How to generate random numbers in SAS - The DO Loop

[2]

1. "The logarithmic transformation and the geometric mean in reporting experimental IgE results..."
http://www.annallergy.org/article/S1081-1206(10)60595-9/abstract

2. Figure_S1.tif at
http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0046423

3. "Cytokine data were log-normally distributed. The values were therefore expressed as geometric means ± standard errors of the means"
http://iai.asm.org/content/73/6/3462.full

4. http://www.biomedcentral.com/1471-2172/8/27

5. "Statistical analyses were performed using SAS 9.1.3 software (SAS Institute Inc., Cary, NC, USA). Cytokine data were log-transformed due to the non-normal distribution of plasma cytokines"
http://arthritis-research.com/content/11/5/r147

6. "Because cytokine and chemokine data showed skewing from the normal distribution, statistical analyses were completed after logarithmic (base 10) transformation of data, which established a normal distribution. ... Values of zero were converted to 1 before logarithmic transformation for statistical analysis. Data are presented in the figures and tables as the mean±SE of the log10 values of individual cytokines and chemokines or of their ratios. To enable comparisons with other studies, we also provide the geometric mean values after transformation back from the log10 value"
http://jid.oxfordjournals.org/content/184/4/393.long#sec-1

Respected Advisor
Posts: 2,655

One thing that is important to note is that with DIST=LOGNORMAL the log of the response is modelled as normal, so on that scale the mean and variance are functionally independent.  Given that, I would move back to the pseudo-likelihood method, and try (untested):

TITLE "----- GLIMMIX for REplicates -----";
PROC GLIMMIX DATA = REplicates ORDER = DATA;
  CLASS EXP CONDITION REPLICATA;
  MODEL VALUEE = CONDITION / DISTRIBUTION = LOGNORMAL;
  /* TYPE = CSH removed: for this, I would fit a simpler variance component only model */
  RANDOM CONDITION / SUBJECT = EXP;
  /* Fit marginal model, with AR(1) for repeated factor */
  RANDOM REPLICATA / RESIDUAL TYPE = AR(1) SUBJECT = EXP*CONDITION;
  NLOPTIONS MAXITER = 2000;
  /* if TECHNIQUE =
     DBLDOG,NMSIMP,NEWRAP,NRRIDG then optimizations cannot be completed
     NONE,QUANEW,CONGRA,QUANEW then empty cells | negatives in "Fit Statistics" | "0.0" value for the "Pearson Chi-Square / DF".
     LEVMAR then the SAS System stopped processing because of errors.
  */
  LSMEANS CONDITION / DIFF ILINK ADJUST = SIMULATE CL PLOTS = DIFFOGRAM(NOABS CENTER);
RUN;

I would apply the same model for triplicates.  Note that ILINK will still report on the log scale with DIST=LOGNORMAL.  You can get geometric means by using the EXP option, or you can get back-transformed least squares means on the original scale using the formulas in the documentation (search for the omega symbol in the DIST= option material).
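For reference, the standard lognormal back-transformation can be written out as below. This is a sketch under the assumption that log Y is normal with mean μ and variance σ²; the symbols are introduced here for illustration and should be checked against the documentation before use.

```latex
% Assumes \log Y \sim N(\mu, \sigma^2); \mu and \sigma^2 are on the log scale.
\begin{align*}
\omega &= \exp(\sigma^2) \\
\text{geometric mean of } Y &= \exp(\mu) \\
\mathrm{E}[Y] &= \exp(\mu)\,\sqrt{\omega} = \exp\!\left(\mu + \tfrac{\sigma^2}{2}\right) \\
\mathrm{Var}[Y] &= \exp(2\mu)\,\omega\,(\omega - 1)
\end{align*}
```

In practice μ would be replaced by a least squares mean on the log scale and σ² by an estimated variance; which variance components belong in σ² depends on the model and on the intended scope of inference.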

Steve Denham

SAS Super FREQ
Posts: 3,304

Stan,

You might be interested in this blog post that I wrote that is based on our discussion: Simulate lognormal data with specified mean and variance - The DO Loop
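The parameter conversion behind such a simulation can be sketched as follows. This is not the code from the blog post, only a minimal illustration of the usual moment-matching formulas; the target mean and SD, the seed, the sample size and the data set name sim_lognormal are all hypothetical.

```sas
/* Sketch: draw lognormal values with a target arithmetic mean and SD. */
/* m, sd, the seed and the sample size are hypothetical choices.       */
DATA sim_lognormal;
  CALL STREAMINIT(12345);            /* fix the seed for reproducibility */
  m  = 80;                           /* target arithmetic mean of Y      */
  sd = 20;                           /* target standard deviation of Y   */
  phi   = SQRT(sd**2 + m**2);
  mu    = LOG(m**2 / phi);           /* mean of log(Y)                   */
  sigma = SQRT(LOG(phi**2 / m**2));  /* SD of log(Y)                     */
  DO i = 1 TO 1000;
    VALUEE = EXP(RAND("Normal", mu, sigma));
    OUTPUT;
  END;
  KEEP i VALUEE;
RUN;

/* The sample mean and SD should land near the targets, up to sampling error. */
PROC MEANS DATA = sim_lognormal N MEAN STD;
  VAR VALUEE;
RUN;
```

The key point is that the normal parameters mu and sigma are not the target mean and SD themselves; they have to be derived from them before exponentiating.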
