02-06-2012 06:33 PM
I'm analysing a dataset that has a binary dependent variable (bird presence or absence), several fixed factors (vegetation height, vegetation density, etc.), and both G-side and R-side random effects. I know that because my models contain R-side random effects, I cannot use either the Laplace or quadrature (QUAD) approximation to get valid AIC values for model comparisons. So is there any way to compare different models? For example, I'd like to compare a model that has only overall vegetation density with a model that includes the density of several different vegetation types, to see which one best explains bird presence or absence (i.e., do we need to spend the time measuring every type of vegetation, or can we lump them all together and explain bird presence just as well?). Are there any other clues for model selection that I can use? I know I can't use any of the pseudo-AIC values for model comparisons. Thanks for any help!
02-15-2012 08:45 PM
McCullagh and Nelder also recommend using the dispersion (scale) parameter to compare GL(M)Ms. The closer it is to unity, the better your model and data meet the assumed residual distribution. They also suggest using the chi-square-to-DOF ratio as an approximation of the same quantity; in your case this is the generalized chi-square divided by its DOF, which GLIMMIX produces. This statistic does not have a formal distribution, so it cannot be used in a significance test of improvement with a P-value, but it can still be used to identify models that, given their fixed and random effect specifications, more (or less) closely meet the specified distribution (binomial with a logit link in your case). Because of both the approximation and the lack of a formal test-statistic distribution, you would be safer to group the 'best' few models together and use some other, even less formal, criterion to choose among them (or perhaps show that the main inferences of interest do not materially change within the group).
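As a rough illustration of the "group the best few" idea, here is a minimal Python sketch. It assumes you have already copied each candidate model's generalized chi-square / DF ratio out of the GLIMMIX fit statistics by hand. The model names, the ratio values, and the 0.15 tolerance are all hypothetical placeholders chosen for the example, not results or recommended cut-offs.

```python
def rank_by_dispersion(ratios):
    """Order models by how close their chi-square / DF ratio is to 1."""
    return sorted(ratios.items(), key=lambda item: abs(item[1] - 1.0))


def shortlist(ratios, tolerance):
    """Keep the models whose ratio lies within `tolerance` of 1."""
    return [name for name, ratio in rank_by_dispersion(ratios)
            if abs(ratio - 1.0) <= tolerance]


# Hypothetical placeholder ratios, NOT output from any real fit.
ratios = {
    "lumped_density": 1.12,
    "density_by_type": 0.97,
    "height_only": 1.45,
}

ranking = rank_by_dispersion(ratios)
best_few = shortlist(ratios, tolerance=0.15)  # tolerance is an arbitrary assumption

for name, ratio in ranking:
    print(f"{name}: ratio = {ratio:.2f}, distance from 1 = {abs(ratio - 1.0):.2f}")
print("shortlist:", best_few)
```

The ranking is only a screening aid. Since the ratio has no formal distribution, the shortlisted models would still need to be compared on less formal grounds, for example whether the main inferences change between them.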