GLIMMIX negative binomial distribution question

SMATT1 · Posted 03-31-2012 02:05 PM

Hello all,

This posting is a follow-up question to a previous posting regarding using PROC GLIMMIX for a simple RCBD.

Study background: I vacuum-sampled insects off of 10 walls covered in vines and 10 adjacent blank walls during three separate months last summer. At each site 10 subsamples were taken. The study mimics an RCBD. The single treatment factor has 2 levels: green and not green. Each site is treated as a block containing both a blank and a green wall, and each site contains 10 0.75 m^2 subsamples. Insect abundance data from the walls follow a non-normal distribution and lack equality of variance, hence PROC GLIMMIX.

Based on helpful suggestions from lvm and PGStats, I included subsamples in my model and did not normalize my data by subsample size. This resulted in data that contained many zeros for blank walls and many higher numbers for green walls. Although I was originally using a Poisson distribution, once I included my raw data and my subsamples my Pearson Chi-Square/DF became very high (for example, 9.1). I tried a negative binomial distribution and obtained a much better fit statistic (1.15).
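For readers following along, here is a rough sketch of why the fit statistic moves the way it does. The notation is my own shorthand and not taken from the GLIMMIX output: $\hat\mu_i$ is the fitted mean for observation $i$, $V(\cdot)$ is the variance function of the chosen distribution, $k$ is the negative binomial scale parameter, and $\mathrm{df}$ is whatever divisor the procedure reports.

```latex
\[
\frac{X^2}{\mathrm{df}}
  = \frac{1}{\mathrm{df}} \sum_{i} \frac{(y_i - \hat\mu_i)^2}{V(\hat\mu_i)},
\qquad
V(\mu) =
\begin{cases}
  \mu            & \text{Poisson} \\
  \mu + k\,\mu^2 & \text{negative binomial, } k > 0
\end{cases}
\]
```

A ratio near 1 suggests the assumed variance function matches the spread in the data. Under the Poisson, a value around 9 says the counts vary far more than their mean allows; the extra $k\mu^2$ term in the negative binomial enlarges the denominator for the large counts, which is what pulls the ratio back toward 1.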

I basically wanted to make sure my code is correct and to see whether people had any comments on the use of the negative binomial distribution for this kind of data. An example for one month's sampling is below:

data abundancevisit1withsub;

input blk trt$ subsample y;

lines;

1 g 1 4

1 g 2 2

1 g 3 1

1 g 4 7

1 g 5 2

1 g 6 3

1 g 7 .

1 g 8 7

1 g 9 5

1 g 10 10

1 b 11 0

1 b 12 0

1 b 13 0

2 g 1 6

2 g 2 3

2 g 3 7

2 g 4 5

2 g 5 17

2 g 6 14

2 g 7 7

2 g 8 4

2 g 9 4

2 g 10 .

2 b 11 0

2 b 12 0

2 b 13 0

3 g 1 0

3 g 2 2

3 g 3 0

3 g 4 3

3 g 5 2

3 g 6 3

3 g 7 0

3 g 8 0

3 g 9 0

3 g 10 0

3 b 11 0

3 b 12 0

3 b 13 0

4 g 1 1

4 g 2 6

4 g 3 7

4 g 4 9

4 g 5 6

4 g 6 3

4 g 7 5

4 g 8 1

4 g 9 6

4 g 10 11

4 b 11 0

4 b 11 1

4 b 11 0

5 g 1 110

5 g 2 157

5 g 3 106

5 g 4 58

5 g 5 183

5 g 6 64

5 g 7 5

5 g 8 46

5 g 9 25

5 g 10 23

5 b 11 0

5 b 12 2

5 b 13 4

6 g 1 12

6 g 2 12

6 g 3 5

6 g 4 4

6 g 5 3

6 g 6 29

6 g 7 3

6 g 8 18

6 g 9 3

6 g 10 52

6 b 11 0

6 b 12 2

6 b 13 0

7 g 1 .

7 g 2 25

7 g 3 .

7 g 4 39

7 g 5 26

7 g 6 28

7 g 7 47

7 g 8 58

7 g 9 20

7 g 10 .

7 b 11 1

7 b 12 1

7 b 13 1

8 g 1 58

8 g 2 2

8 g 3 1

8 g 4 2

8 g 5 3

8 g 6 3

8 g 7 4

8 g 8 3

8 g 9 2

8 g 10 1

8 b 11 0

8 b 12 0

8 b 13 0

9 g 1 6

9 g 2 10

9 g 3 16

9 g 4 20

9 g 5 14

9 g 6 15

9 g 7 22

9 g 8 10

9 g 9 13

9 g 10 14

9 b 11 0

9 b 12 0

9 b 13 0

10 g 1 11

10 g 2 4

10 g 3 8

10 g 4 14

10 g 5 17

10 g 6 27

10 g 7 36

10 g 8 34

10 g 9 32

10 g 10 34

10 b 11 0

10 b 12 2

10 b 13 2

;

proc print data=abundancevisit1withsub;

run;

proc glimmix data=abundancevisit1withsub method=quad;

class trt blk subsample;

model y = trt / dist=negbinomial link=log;

random int trt / sub=blk;

lsmeans trt / cl ilink;

run;

PGStats · Posted 03-31-2012 04:26 PM

All seems fine to me except for the presence of trt as a random effect. I can think of no reason to treat the levels of trt as samples from a population of treatments. I propose you use:

proc glimmix data=abundancevisit1withsub;

class trt blk;

model y = trt / dist=negbinomial link=log;

random int / sub=blk solution;

lsmeans trt / cl pdiff ilink;

run;

PG

lvm · Posted 04-01-2012 09:43 PM

Actually, you do need trt in the random statement:

random int trt / sub=blk solution;

This statement gives random effects of blk and trt*blk, and the latter serves as the "experimental error" term. Without it, the procedure would mistake the subsamples for the experimental units.
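To spell out what that random statement implies, here is the model as I would sketch it. The symbols are my own labels, not anything GLIMMIX prints: $i$ indexes treatment (g or b), $j$ indexes block (site, 1 to 10), $k$ indexes subsample within a wall, and $\phi$ stands for the negative binomial scale parameter.

```latex
\begin{align*}
y_{ijk} \mid b_j, (b\tau)_{ij} &\sim \mathrm{NegBin}(\mu_{ijk}, \phi) \\
\log \mu_{ijk} &= \beta_0 + \tau_i + b_j + (b\tau)_{ij} \\
b_j &\sim N(0, \sigma^2_{b}), \qquad
(b\tau)_{ij} \sim N(0, \sigma^2_{b\tau})
\end{align*}
```

Here $b_j$ is the `int` term within `sub=blk` and $(b\tau)_{ij}$ is the `trt` term within `sub=blk`. Each wall (one trt within one blk) gets its own $(b\tau)_{ij}$, so the walls, not the subsamples, act as the replicates for testing trt. If $(b\tau)_{ij}$ is dropped, the only variation left below the block level is subsample-to-subsample variation, and the test of trt is then based on the subsamples as if they were independent experimental units.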