From your study description, it does not seem to me that a particular deer location is truly paired with a random location. Instead, it appears that you have a set of deer locations and a second set of random locations that you haphazardly paired together. Is that true?
If your primary objective is to determine the effects of hunting pressure on habitat selection by deer, controlling for background habitat distribution, (distance to?) road, slope, and elevation, and if deer locations are not truly paired with random locations, then you might consider a multinomial model where the response is HABVALUE, rather than a binomial model where the response is deer/random and HABVALUE is an explanatory factor.
If the random and deer locations are truly paired, then you need to build that pairing into your model. Pairing in logistic regression is not straightforward; in the literature, look for “conditional logistic regression” or “matched-pairs logistic regression” or “matched-set logistic regression” for discussions and examples. Possibly pertinent to your resource selection question is Duchesne et al. 2010, J Animal Ecology 79:548-555 (and references within). Note that the two location types (deer and random) would be matched on some factors (e.g., same YEAR, TREAT, EXPOSURE, DN, DTID, although the latter two don’t really make sense for random locations) but not on others (e.g., ROAD_LOG, ELEVATION, SLOPE, HABVALUE).
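To make the role of the pairing concrete, here is a sketch of the conditional likelihood for the simplest case of 1:1 matching. The notation is my own, not from your post: pair i has one deer location with unmatched covariates x_{i1} (e.g., ROAD_LOG, ELEVATION, SLOPE, HABVALUE indicators), one random location with covariates x_{i0}, and a pair-specific intercept alpha_i that absorbs everything the pair shares.

```latex
% Contribution of matched pair i to the conditional likelihood
% (assumed notation: x_{i1} = deer location, x_{i0} = random location)
L_i(\boldsymbol{\beta})
  = \frac{\exp(\alpha_i + \mathbf{x}_{i1}^{\top}\boldsymbol{\beta})}
         {\exp(\alpha_i + \mathbf{x}_{i1}^{\top}\boldsymbol{\beta})
          + \exp(\alpha_i + \mathbf{x}_{i0}^{\top}\boldsymbol{\beta})}
  = \frac{1}{1 + \exp\{-(\mathbf{x}_{i1} - \mathbf{x}_{i0})^{\top}\boldsymbol{\beta}\}}
```

The alpha_i cancels, so main effects of the matched factors (YEAR, TREAT, and so on) cannot be estimated from a conditional model; what remains estimable are the unmatched covariates and their interactions with the matched factors, which is where an effect of hunting pressure on selection would show up.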
In addition to the pairing issue, model specification also depends upon the answers to these questions.
Were the deer marked so that you could identify individual deer? (I assume so, but please confirm.) How were they marked (e.g., GPS collars)?
How many deer did you observe? About how many times was each observed and on what schedule?
Does each deer stay within one treated area, or do deer use multiple treated areas? If they use multiple treated areas, does each deer use all three treated areas?
For fixed effects factors you have:
TREAT (hunting pressure with 3 levels); notably TREAT is not truly replicated—you’re using individual deer (ID) as replications of levels of TREAT rather than additional areas, and you’ll need to interpret the results of your study accordingly. Get TREAT out of the RANDOM statement, unless it's in an interaction with a random effects factor.
EXPOSURE with 2 levels; I presume that you have information for both levels of this factor on each ID, although I also assume that you might have only one or the other for some deer—true? Thus, ID is not the experimental/observational unit associated with EXPOSURE; rather EXPOSURE is a repeated measurement on ID (think split-plot design).
DN with 2 levels (diurnal and nocturnal); like EXPOSURE, you probably have information for both levels of this factor on each ID, again possibly missing one or the other for some deer. Again, think split-plot.
YEAR with 2 levels; do you have different deer in different years, or the same deer in both years? YEAR could be a random factor, but keep in mind that you would then be attempting to estimate variance among years based on only two years; the quality of this estimate would be quite poor. In field studies, temporal variability is a given. The nature of year is usually problematic: it’s not random (because the levels of year are not a random sample from the population of years of interest, unless you have access to a time machine); it’s not truly fixed, because the years you sampled are simply the years you happened to be in the field (I presume while you were in graduate school); and it can’t be truly replicated (unless you can work in parallel universes). I usually treat year as fixed (unless I have data for a lot of years), or even do a separate analysis for each year to assess “repeatability”.
ROAD_LOG, SLOPE, ELEVATION as continuous-scale factors, which may be correlated with HABVALUE levels
This study is a “quasi-experiment” where "experimental" (explanatory) factors are not randomly assigned (or even able to be assigned at all) to experimental units. Consequently, you could have problems with data distribution: the factorial defined by your categorical fixed effects could be incomplete (meaning that some combinations of factors have no observations), you could have complete or quasi-complete separation problems with your binomial response (which would be my first guess at the reason for convergence failure), or you could just be spread too thin in some regions of the explanatory variable space for good estimation. Or your model specification may be wrong.
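Before fitting anything, it can help to tabulate the data by cell. The following is a minimal sketch in plain Python, not your SAS code: the response name USED (1 = deer, 0 = random), the level labels, and the toy records are all hypothetical. It counts deer and random records in every combination of the categorical factors and flags cells that are empty, or that hold only one response type (a warning sign for separation).

```python
from collections import Counter
from itertools import product

def cell_counts(rows, factors, response):
    """Count random (0) and deer (1) records in every cell of the factorial."""
    counts = Counter()
    for row in rows:
        cell = tuple(row[f] for f in factors)
        counts[(cell, row[response])] += 1
    # Observed levels of each factor define the full factorial.
    levels = [sorted({row[f] for row in rows}) for f in factors]
    report = {}
    for cell in product(*levels):
        n0 = counts[(cell, 0)]
        n1 = counts[(cell, 1)]
        if n0 + n1 == 0:
            status = "empty"        # incomplete factorial
        elif n0 == 0 or n1 == 0:
            status = "separated"    # only one response type in this cell
        else:
            status = "ok"
        report[cell] = (n0, n1, status)
    return report

# Hypothetical toy records; real data would have 3 TREAT levels, etc.
rows = [
    {"TREAT": "low", "DN": "day", "USED": 1},
    {"TREAT": "low", "DN": "day", "USED": 0},
    {"TREAT": "low", "DN": "night", "USED": 1},
    {"TREAT": "high", "DN": "day", "USED": 0},
    {"TREAT": "high", "DN": "day", "USED": 1},
]
report = cell_counts(rows, ["TREAT", "DN"], "USED")
for cell, (n0, n1, status) in sorted(report.items()):
    print(cell, n0, n1, status)
```

A cross-tabulation in SAS would give you the same information; the point is simply to look at the cell counts before blaming the optimizer.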
As you note, it’s complex. I would start REALLY simple: Resolve the pairing issue, and decide whether to go the resource selection function route. Sort out a model without the YEAR, EXPOSURE, DN, ROAD_LOG, ELEVATION, and SLOPE terms (I would just drop these terms from the model, rather than using them to specify a subset of the data as you’ve done with the BY statement) and with a bare minimum RANDOM statement. Get that working, then build up.
HTH,
Susan