Hello all,
This posting is a follow-up question to a previous posting regarding using PROC GLIMMIX for a simple RCBD.
Study background: I vacuum-sampled insects from 10 walls covered in vines and 10 adjacent blank walls during three separate months last summer. The study mimics an RCBD. The single treatment factor has 2 levels, green and not green. Each site is treated as a block containing both a blank and a green wall, and 10 subsamples of 0.75 m^2 each were taken at each site. Insect abundance data from the walls are non-normally distributed and have unequal variances, hence PROC GLIMMIX.
Based on helpful suggestions from IVM and PGSTATS, I included subsamples in my model and did not normalize my data by subsample size. This resulted in data with many zeros for blank walls and many higher counts for green walls. I was originally using a Poisson distribution, but once I included my raw data and my subsamples, the Pearson Chi-Square/DF became very high (for example, 9.1). I then tried a negative binomial distribution and obtained a much better fit statistic (1.15).
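For readers comparing the two fit statistics above, here is a sketch of how the Pearson statistic relates to the variance function of each distribution. The symbols are notation introduced here, not taken from the post: \hat{\mu}_i is the fitted mean for observation i, k is the negative binomial scale parameter, and df is the degrees of freedom, whose exact definition in PROC GLIMMIX depends on the estimation method.

```latex
\[
\frac{\chi^2_P}{\mathrm{df}}
  = \frac{1}{\mathrm{df}} \sum_{i} \frac{(y_i - \hat{\mu}_i)^2}{V(\hat{\mu}_i)},
\qquad
V(\mu) =
\begin{cases}
\mu & \text{Poisson} \\
\mu + k\mu^2 & \text{negative binomial}
\end{cases}
\]
```

A ratio well above 1, such as 9.1, suggests more variation than V(\mu) = \mu allows. The extra k\mu^2 term lets the variance grow faster than the mean, which is consistent with the ratio dropping to about 1.15.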
I basically want to make sure my code is correct and to see whether people have any comments on using the negative binomial distribution for this kind of data. An example for one month’s sampling is below:
data abundancevisit1withsub;
input blk trt$ subsample y;
lines;
1 g 1 4
1 g 2 2
1 g 3 1
1 g 4 7
1 g 5 2
1 g 6 3
1 g 7 .
1 g 8 7
1 g 9 5
1 g 10 10
1 b 11 0
1 b 12 0
1 b 13 0
2 g 1 6
2 g 2 3
2 g 3 7
2 g 4 5
2 g 5 17
2 g 6 14
2 g 7 7
2 g 8 4
2 g 9 4
2 g 10 .
2 b 11 0
2 b 12 0
2 b 13 0
3 g 1 0
3 g 2 2
3 g 3 0
3 g 4 3
3 g 5 2
3 g 6 3
3 g 7 0
3 g 8 0
3 g 9 0
3 g 10 0
3 b 11 0
3 b 12 0
3 b 13 0
4 g 1 1
4 g 2 6
4 g 3 7
4 g 4 9
4 g 5 6
4 g 6 3
4 g 7 5
4 g 8 1
4 g 9 6
4 g 10 11
4 b 11 0
4 b 11 1
4 b 11 0
5 g 1 110
5 g 2 157
5 g 3 106
5 g 4 58
5 g 5 183
5 g 6 64
5 g 7 5
5 g 8 46
5 g 9 25
5 g 10 23
5 b 11 0
5 b 12 2
5 b 13 4
6 g 1 12
6 g 2 12
6 g 3 5
6 g 4 4
6 g 5 3
6 g 6 29
6 g 7 3
6 g 8 18
6 g 9 3
6 g 10 52
6 b 11 0
6 b 12 2
6 b 13 0
7 g 1 .
7 g 2 25
7 g 3 .
7 g 4 39
7 g 5 26
7 g 6 28
7 g 7 47
7 g 8 58
7 g 9 20
7 g 10 .
7 b 11 1
7 b 12 1
7 b 13 1
8 g 1 58
8 g 2 2
8 g 3 1
8 g 4 2
8 g 5 3
8 g 6 3
8 g 7 4
8 g 8 3
8 g 9 2
8 g 10 1
8 b 11 0
8 b 12 0
8 b 13 0
9 g 1 6
9 g 2 10
9 g 3 16
9 g 4 20
9 g 5 14
9 g 6 15
9 g 7 22
9 g 8 10
9 g 9 13
9 g 10 14
9 b 11 0
9 b 12 0
9 b 13 0
10 g 1 11
10 g 2 4
10 g 3 8
10 g 4 14
10 g 5 17
10 g 6 27
10 g 7 36
10 g 8 34
10 g 9 32
10 g 10 34
10 b 11 0
10 b 12 2
10 b 13 2
;
proc print data=abundancevisit1withsub;
run;
proc glimmix data=abundancevisit1withsub method=quad;
class trt blk subsample;
model y = trt / dist=negbinomial link=log;
random int trt / sub=blk;
lsmeans trt / cl ilink;
run;
All seems fine to me except for the presence of trt as a random effect. I can think of no reason to treat the levels of trt as samples from a population of treatments. I propose you use:
proc glimmix data=abundancevisit1withsub;
class trt blk;
model y = trt / dist=negbinomial link=log;
random int / sub=blk solution;
lsmeans trt / cl pdiff ilink;
run;
PG
Actually, you do need trt in the RANDOM statement:
random int trt / sub=blk solution;
This statement gives random effects for blk and trt*blk, and the latter serves as the "experimental error" term. Without it, the procedure would mistake the subsamples for the experimental units.
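Written out, the model implied by including both the intercept and trt as random effects within blk can be sketched as follows. The notation is introduced here for illustration only: i indexes treatment, j block, and l subsample; b_j and (\tau b)_{ij} are the random block and block-by-treatment effects; k is the negative binomial scale parameter.

```latex
\begin{align*}
y_{ijl} \mid b_j, (\tau b)_{ij} &\sim \text{NegBin}(\mu_{ij},\, k), \\
\log \mu_{ij} &= \eta + \tau_i + b_j + (\tau b)_{ij}, \\
b_j &\sim N(0, \sigma^2_{b}), \qquad (\tau b)_{ij} \sim N(0, \sigma^2_{\tau b}).
\end{align*}
```

Each wall (a blk-by-trt combination) gets its own random effect, which its subsamples share, so the treatment comparison is judged against wall-to-wall variation rather than subsample-to-subsample variation. Dropping trt from the RANDOM statement removes the (\tau b)_{ij} term.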