BookmarkSubscribeRSS Feed
Calcite | Level 5

I am attempting to compare the resistance to an intestinal parasite for two breeds of goat (I and J). Resistance is diagnosed by counting the numbers of parasite eggs in a fixed quantity of manure (a dirty job, but someone has to do it). This is a field study, with animals of the two breeds found on different farms. The number of farms is relatively large but the number of goats per farm is small to modest (522 total goats on 68 total farms; numbers of farms per breed and goats per farm are unequal, but not terribly so). Male and female goats were present on most (57 of 68) farms. The resulting design is a split-split plot or, perhaps more accurately, a split-plot with repeated observations (2 each) on the animals that make up the sub-plot error.


For a Gaussian variable, the ANOVA table would be:

                Effect                                    d.f.                         Test Term                            Comments

                Total                                      1043

                Breed (B)                                    1                      Farm(B)

                Farm (F) (breed)                  66                       Animal(B-F-S)

                Sex (S)                                       1                       Animal(B-F-S)

                B x S                                             1                       Animal(B-F-S)

                Animal(B-F-S)                      452                       Residual

                Time (T)                                     1                       Residual

                B x T                                             1                       Residual

                S x T                                             1                       Residual

                B x S x T                                      1                       Residual

                Residual                                 518                                                                       F(B) x T + A(B-F-S) x T


There is no “true” error term, since there is no replication for animals measured at a specific time. The residual is therefore the combined effects of the F(B) x T and A(B-F-S) x T interactions.


The data is clearly not Gaussian, with some amount of clumping at zero (approximately 20% of the observations are zero), but also relatively strong right skewness. That pattern is typical of these sorts of data. Traditionally data like these have been log-transformed and then analyzed as a Gaussian variable. However, we have been challenged to use PROC GLIMMIX in SAS to fit the data as a Poisson or negative binomial variable.


The data do not fit a Poisson distribution. If I start with only the fixed effects, the data are very strongly over-dispersed relative to a Poisson. This is also a typical result for these sorts of data. Stepwise addition of random farm and animal effects reduce, but do not remove, the over-dispersion. Fitting a negative binomial allows estimation of a scale parameter that assists in handling the over-dispersion. However, Walt Stroup’s 2013 book on “Generalized Linear Mixed Models” seems to indicate that attempting to fit BOTH a negative binomial scale parameter and random effects of farm and animal places me on pretty thin ice and is not recommended. Yet the farm effects, in particular, are a critical part of the design and, it seems to me, have to be in the model. I have more options for the animal effects (e.g., analyzing the sum of the two observations).


Based on changes in likelihood (AIC, BIC), step-wise addition of random farm and animal effects to the negative binomial model appears to improve goodness of fit but resulting least-squares means for the fixed effects change relatively dramatically and, it seems to me, unrealistically, if I fit effects of both farm and animal. Having only 2 repeated observations does not seem to be helping; in particular, SE of LS means for effects and interactions involving time get really small when I fit animal effects. Other data sets, with larger numbers of repeated observations (but no farm effects) seem to yield reasonable results. The ice seems to be getting thinner and thinner.


The statements I am using for the negative binomial analyses are shown below. The first random statement seems to be ok but I’m not sure that the second random statement is correct to fit the random effect of animal(breed*farm*sex). Any comments?


proc glimmix data=epg method=laplace;

title PROC GLIMMIX - Negative  binomial;

nloptions maxiter=200;

class sex breed farm animal time;                                                                                       

model fec=breed sex time breed*sex breed*time sex*time breed*sex*time/dist=negbin;

random intercept/subject=farm(breed);

random intercept/subject=animal(breed*farm*sex);

lsmeans breed sex breed*sex/pdiff ilink;



Does this code do what I think it is doing? And what sort of limitations are there for fitting random effects for distributions with scale parameters? Suggestions will be appreciated.

Lapis Lazuli | Level 10

i would use nlmixed, here is a relevant paper:, it covers random effects


sorry, im rushed and cannot get more into it, it sounds like an interesting study and design

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 1 reply
  • 2 in conversation