Thanks a lot for your in-depth reply and suggestions.
Sections are determined by canopy height: 0-33%, 34-66%, and 67-100%. I then took 10 leaves from each section, selected at random based on a direction (0-360 degrees). This is part of my dissertation, and my PI wasn't too keen on including section, but I feel a lot of the variation in counts can be captured by section because the insect likes the shaded parts of the plants and densities decline as canopy height increases. I've been using averages of the 7020 observations (30 leaves x 9 sites x 26 intervals) for counts across the study period. This breaks down into 234 means (9 sites x 26 intervals) that go into the model. The code and a subset of the output for one interval look like this.
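/* Sort the most recently created data set (no data= given) by the same
   variables, in the same order, as the BY statement in PROC MEANS below. */
proc sort;
  by interval season climate site previous_para;
run;

/* Pass 1: mean of total and logarea within each interval x site. */
proc means mean n noprint;
  var total logarea;
  by interval season climate site previous_para;
  output out=new mean=mnqtot mnarea n=n;
run;

proc sort data=new;
  by interval season climate site previous_para;
run;

/* Pass 2: averages the pass-1 means over the same BY groups, so each
   group holds a single row and n = 1 in the printed output. */
proc means data=new mean n noprint;
  var mnqtot mnarea;
  by interval season climate site previous_para;
  output out=final mean=totalfin areafin n=n;
run;

proc sort data=final;
  by interval season climate site previous_para;
run;

proc print data=final;
run;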
Obs  interval  season  climate  site  previous_para  totalfin  areafin  n
 10      2     Warm    Coastal    1       0.660        10.67   3.09633  1
 11      2     Warm    Coastal    2       0.912       306.97   4.24529  1
 12      2     Warm    Coastal    3       0.495       514.60   3.05256  1
 13      2     Warm    Coastal    4       0.722       352.17   2.79876  1
 14      2     Warm    Coastal    5       0.426       506.83   3.52447  1
 15      2     Warm    Inland     8       0.648       139.67   3.43376  1
 16      2     Warm    Inland     9       0.801        73.43   2.91858  1
 17      2     Warm    Interm     6       0.680       214.57   3.32450  1
 18      2     Warm    Interm     7       0.429        87.10   2.69907  1
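On the section question: if section were kept in the aggregation, one way to sketch it is to add it as a grouping variable, which would give 702 means (3 sections x 9 sites x 26 intervals, 10 leaves each) instead of 234. The data set name leaves and the variable name section are hypothetical placeholders, since neither appears in the code above; the other names are taken from it.

```sas
/* Sketch only. LEAVES and SECTION are hypothetical names for the
   leaf-level data set and the canopy-height section (1-3). */
proc means data=leaves nway missing noprint;
  /* CLASS groups without needing a prior sort; MISSING keeps rows where a
     class variable (e.g. previous_para) is missing instead of dropping them */
  class interval season climate site previous_para section;
  var total logarea;
  /* NWAY keeps only the full cross-classification in the output */
  output out=bysection mean=mnqtot mnarea n=n;
run;
```

With this layout each row of bysection would be one section within one site at one interval, so section could then enter the model as its own term.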
@sld wrote:
Yes, interval is a factor associated with repeated measurements on sites. That would be closer to correct, but you are not using the residual option correctly. Only RANDOM statements that specify elements of the R matrix should include the residual option; RANDOM statements that specify elements of the G matrix should not. Site and repeated measures on sites are specified in G; leaves are specified in R.
This may well be the issue. I was interpreting the RANDOM statement with the residual option in PROC GLIMMIX as equivalent to the REPEATED statement in PROC MIXED, so I thought that any time you had a repeated measurement, it had to be included.
For example, on page 9 of the Advanced Techniques for Fitting Mixed Models reference you cited, it says:
"Or, suppose you have the following REPEATED statement in PROC MIXED with a repeated effect of Time:
repeated Time / subject=Block type=ar(1);
You can replace that statement with the following RANDOM statement in PROC GLIMMIX:
random Time / subject=Block type=ar(1) residual;"
Removing the RANDOM statement prevents even the most simplified version of this model from converging. So are you saying that if I'm using site averages rather than leaf counts, I have no R-side effects in my current model?
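To make the G-side versus R-side distinction concrete, here is a minimal sketch of the two forms of the statement. It uses the site-level data set final and the variable names from the code above; the fixed effects and dist=normal are placeholders chosen only for illustration and are an assumption, not the actual model.

```sas
/* Sketch only: fixed effects and dist= are illustrative assumptions. */

/* (a) R-side: the RESIDUAL option puts the AR(1) structure across
       intervals within a site into the R matrix. */
proc glimmix data=final;
  class interval climate site;
  model totalfin = climate previous_para / dist=normal;
  random interval / subject=site type=ar(1) residual;
run;

/* (b) G-side: the same statement without RESIDUAL puts the AR(1)
       structure into the G matrix (a random effect for each interval
       within a site), in addition to the residual variance. */
proc glimmix data=final;
  class interval climate site;
  model totalfin = climate previous_para / dist=normal;
  random interval / subject=site type=ar(1);
run;
```

The only difference between the two runs is the residual keyword, which is the point of the quoted passage: the RANDOM statement itself does not decide whether an effect lands in G or R.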
Yes, climate designations aren't just temperature. They are based on southern California climate zones: coastal (USDA Hardiness Zone 10b; Sunset Climate Zone 24), intermediate (USDA Hardiness Zone 10a; Sunset Climate Zones 20–22), and inland (USDA Hardiness Zones 9a/9b; Sunset Climate Zones 18–19).
Counts are of the pest insect (whitefly). Previous parasitism is the total parasitism of the whitefly by three parasitic wasps observed in the previous interval, which could be affecting whitefly densities in the subsequent interval.
It looks like I still have a lot of reading to do on adding regressions/splines to a mixed model.
Thanks!