Solved
Contributor
Posts: 42

# need to verify my random statement for GLIMMIX (repeated measures)

Study evaluated fecal egg counts (FEC) in lambs. 4 treatments, FEC analyzed on 2 separate days (repeated measures).

Data is not normal. If it were normal, I'd run PROC Mixed with the following:

MODEL logFEC = TRT day trt*day;

Repeated  day/subject = id(trt) type =ARH(1);

Then, if trt*day is significant, data would be re-analyzed by day.

Just looking for help to make sure the following statement is correct for GLIMMIX, to account for repeated measure.

DATA VA;  SET FEC;

LOGFEC = LOG(FEC+100);

PROC SORT; BY day TRT;

PROC PRINT;

RUN;QUIT;

PROC GLIMMIX;

CLASS TRT day id;

MODEL logFEC = TRT day trt*day;

Random day / subject = id(trt) type =ARH(1) residual;

RUN;QUIT;

ALSO: what would be the difference in the above PROC GLIMMIX statement vs. the one below:

PROC GLIMMIX;

CLASS TRT day id;

MODEL logFEC = TRT day trt*day;

Random _Residual_ /subject = ID(trt) type =ARH(1);

Thanks in advance for any help.

Accepted Solutions
Solution
‎01-23-2013 10:23 AM
Posts: 2,655

## Re: need to verify my random statement for GLIMMIX (repeated measures)

Back to the zeros question, the occurrence doesn't look like there is zero inflation to any great extent.

I ran the analysis several ways: overdispersion R-side and conditional G-side, using a negative binomial distribution  As points out for non-normal distributions, you come to radically different conclusions.  The overdispersed model resulted in no significances when the Kenward-Rogers correction is applied, while the G-side model and the uncorrected R-side gave significant main effects, but nonsignificant interaction.

Further, if you choose to use a G-side model, you may wish to use Gaussian-hermite quadrature as an optimization method, rather than the default restricted maximum pseudo-likelihood.  This may be critical because the variance for the count related distributions (Poisson and negative binomial) influences the estimates, and needs to be included in the optimization and not profiled out as in the pseudo-likelihood methods.  Someplace that I can't find right now has some references on this. And this version of the analysis gives a "nearly significant" test for day, but nonsignificant for treatment and treatment by day interaction.

Steve Denham

All Replies
Posts: 2,655

## Re: need to verify my random statement for GLIMMIX (repeated measures)

Hopefully some others will chime in on the differences between the two specifications.  I would think that the first is the standard repeated measures type of parameterization, with a heterogeneous autoregressive error structure, implying different variances on the days.  The second models overdispersion, and I'm coming up blank on how the error structure is modeled over day.  Perhaps or or can help out.

But that's not really what I want to talk about.  The power of GLIMMIX is the ability to specify error distributions appropriate to the variable at hand.  Here it is a count.  Almost certainly it is a negative binomial, as I would guess that the variance is not equal to the mean (Poisson assumption).  I would approach this as:

PROC GLIMMIX data=yourdata;

CLASS TRT day id;

MODEL FEC = TRT day trt*day/dist=negbin ddfm=kr2;

Random day / subject = id(trt) type =ARH(1) residual;

RUN;

The following have been entered:

dist=negbin, to model a negative binomial distribution for egg counts.  I would also make sure that this variable is the actual counts, rather than the counts/100, unless there are convergence problems.

ddfm=kr2, to apply a Kenward-Rogers adjustment to the denominator degrees of freedom to accommodate for the repeated measures

slicediff=day: to get simple effect comparisons between treatments at each day, if there is a significant trt*day interaction.  Doing a separate by day analysis if the interaction is significant is like chopping off one foot before running a marathon.  You have the ability to accommodate the actual design with GLIMMIX.

If there are only two days observed for each id(trt), then the following covariance structures will be equivalent: ARH(1), CSH, UN, CHOL, UNR and any of the spatial correlation structures.  I would use the unstructured Cholesky parameterization (CHOL) to model unequal variances by day.

As a last comment, do you have a lot of zeros in the data?  This might be the case if one or more of the treatments is exceedingly effective at killing worms, resulting in a zero shed rate, or if this study is dealing with natural infections, where some of the lambs have no worms and thus shed no eggs.  If there is zero inflation, then we probably need to reconsider everything about the analysis.

Hope this helps, and let us know how it turns out.

Steve Denham

Contributor
Posts: 42

## Re: need to verify my random statement for GLIMMIX (repeated measures)

Many thanks for the quick response.

Yes... zero's are in the data (see below; sorted by day, trt, FEC).

 ID PENuse TRTuse DAY FEC 9377 R1b CNTLivm 37 0 9430 R1b CNTLivm 37 0 9440 R1b CNTLivm 37 0 9516 R5b CNTLivm 37 0 9425 R1b CNTLivm 37 50 9433 R3b CNTLivm 37 50 9417 R7b CNTLivm 37 150 9521 R3b CNTLivm 37 150 9391 R5b CNTLivm 37 200 9413 R3b CNTLivm 37 250 9361 R7b CNTLivm 37 350 9422 R5b CNTLivm 37 400 9491 R5b CNTLivm 37 400 9421 R7b CNTLivm 37 600 9511 R3b CNTLivm 37 950 9506 R7b CNTLivm 37 1400 9399 R1a CNTLno 37 50 9441 R1a CNTLno 37 50 9515 R3a CNTLno 37 100 9407 R1a CNTLno 37 150 9369 R5a CNTLno 37 200 9431 R5a CNTLno 37 200 9475 R3a CNTLno 37 250 9402 R3a CNTLno 37 350 9455 R1a CNTLno 37 350 9439 R7a CNTLno 37 500 9485 R7a CNTLno 37 600 9519 R7a CNTLno 37 600 9371 R5a CNTLno 37 800 9432 R7a CNTLno 37 800 9456 R5a CNTLno 37 800 9365 R3a CNTLno 37 1150 9360 R4b JUNivm 37 0 9362 R2b JUNivm 37 0 9414 R2b JUNivm 37 0 9418 R8b JUNivm 37 0 9503 R4b JUNivm 37 0 9518 R6b JUNivm 37 0 9363 R2b JUNivm 37 50 9410 R4b JUNivm 37 50 9454 R2b JUNivm 37 50 9464 R6b JUNivm 37 50 9408 R8b JUNivm 37 100 9376 R6b JUNivm 37 200 9472 R8b JUNivm 37 200 9420 R6b JUNivm 37 250 9483 R4b JUNivm 37 250 9364 R8b JUNivm 37 450 9381 R8a JUNno 37 0 9473 R2a JUNno 37 50 9397 R2a JUNno 37 100 9426 R6a JUNno 37 100 9438 R4a JUNno 37 100 9462 R2a JUNno 37 150 9474 R4a JUNno 37 150 9453 R8a JUNno 37 200 9457 R2a JUNno 37 200 9469 R6a JUNno 37 250 9375 R4a JUNno 37 400 9388 R8a JUNno 37 500 9394 R8a JUNno 37 600 9396 R4a JUNno 37 850 9479 R6a JUNno 37 850 9379 R6a JUNno 37 1150 9430 R1 CNTLivm 47 0 9377 R1 CNTLivm 47 50 9433 R3 CNTLivm 47 50 9516 R5 CNTLivm 47 50 9521 R3 CNTLivm 47 50 9417 R7 CNTLivm 47 100 9425 R1 CNTLivm 47 100 9440 R1 CNTLivm 47 150 9361 R7 CNTLivm 47 200 9491 R5 CNTLivm 47 450 9421 R7 CNTLivm 47 650 9422 R5 CNTLivm 47 700 9391 R5 CNTLivm 47 900 9413 R3 CNTLivm 47 1000 9511 R3 CNTLivm 47 1050 9506 R7 CNTLivm 47 2750 9369 R5 CNTLno 47 50 9475 R3 CNTLno 47 50 9431 R5 CNTLno 47 100 9485 R7 CNTLno 47 100 9515 R3 CNTLno 47 100 9519 R7 CNTLno 47 100 9399 R1 CNTLno 47 150 9439 R7 CNTLno 47 250 9402 R3 CNTLno 47 300 9407 R1 CNTLno 47 350 9441 R1 CNTLno 47 600 9365 R3 CNTLno 47 800 9455 R1 CNTLno 47 800 9432 R7 CNTLno 47 850 9456 R5 CNTLno 47 950 9371 R5 CNTLno 47 1250 9360 R4 JUNivm 47 0 9363 R2 JUNivm 47 0 9408 R8 JUNivm 47 0 9362 R2 JUNivm 47 50 9414 R2 JUNivm 47 50 9418 R8 JUNivm 47 50 9518 R6 JUNivm 47 50 9454 R2 JUNivm 47 100 9503 R4 JUNivm 47 100 9420 R6 JUNivm 47 150 9472 R8 JUNivm 47 250 9483 R4 JUNivm 47 250 9376 R6 JUNivm 47 400 9410 R4 JUNivm 47 450 9364 R8 JUNivm 47 500 9464 R6 JUNivm 47 600 9381 R8 JUNno 47 0 9426 R6 JUNno 47 100 9473 R2 JUNno 47 200 9397 R2 JUNno 47 250 9438 R4 JUNno 47 250 9469 R6 JUNno 47 350 9375 R4 JUNno 47 400 9396 R4 JUNno 47 400 9453 R8 JUNno 47 400 9462 R2 JUNno 47 400 9457 R2 JUNno 47 500 9479 R6 JUNno 47 500 9379 R6 JUNno 47 600 9388 R8 JUNno 47 600 9394 R8 JUNno 47 650 9474 R4 JUNno 47 950
Valued Guide
Posts: 684

## Re: need to verify my random statement for GLIMMIX (repeated measures)

Your two versions of your original GLIMMIX code fit the same model, with same results. This is because you are using a normal distribution (the default), with an identity link function. In fact, you could fit the same model with MIXED (using the log as the response and a REPEATED statement) because of the normality. These two versions work equivalently when you have all times in all subjects. Also, as Steve indicates, with only two times, ARH(1) is not different from several other structures.

The more complex questions would arise if you took the GLMM approach and model the response (not log response) as a member of the exponential family of distributions (with a log link). Then you would need to distinguish between

Random day / subject = id(trt) type =ARH(1) residual;

and

Random day / subject = id(trt) type =ARH(1) ;

You didn't ask about this, but now you would get different results for some non-normal distributions (e.g., Poisson). Here, the first one (with RESIDUAL option) would model overdispersion directly (R-side analysis) where the second version is modeling the linear predictor conditional on the random effect (G-side analysis).

Solution
‎01-23-2013 10:23 AM
Posts: 2,655

## Re: need to verify my random statement for GLIMMIX (repeated measures)

Back to the zeros question, the occurrence doesn't look like there is zero inflation to any great extent.

I ran the analysis several ways: overdispersion R-side and conditional G-side, using a negative binomial distribution  As points out for non-normal distributions, you come to radically different conclusions.  The overdispersed model resulted in no significances when the Kenward-Rogers correction is applied, while the G-side model and the uncorrected R-side gave significant main effects, but nonsignificant interaction.

Further, if you choose to use a G-side model, you may wish to use Gaussian-hermite quadrature as an optimization method, rather than the default restricted maximum pseudo-likelihood.  This may be critical because the variance for the count related distributions (Poisson and negative binomial) influences the estimates, and needs to be included in the optimization and not profiled out as in the pseudo-likelihood methods.  Someplace that I can't find right now has some references on this. And this version of the analysis gives a "nearly significant" test for day, but nonsignificant for treatment and treatment by day interaction.

Steve Denham

Contributor
Posts: 42

## Re: need to verify my random statement for GLIMMIX (repeated measures)

I come up with the same conclusion. I run it with ddfm=KR2 and all p-values are near 1.0 (this can't be right).

I run it without KR2 and the P values "seem" to tell the story (I know P-values don't tell you anything about the model, but....)

PROC GLIMMIX;  CLASS TRTuse day id;

MODEL FEC = TRTuse day trtuse*day/dist=negbin link=log;

Random day / subject = id(trtuse) type =ARH(1) residual;

Also: when I use Laplace vs. QUAD method, the model fits better. I don't understand the differences, but I think I should go with Laplace.

Again, thanks for all the help.

🔒 This topic is solved and locked.