Hi, I've been feverishly working to model habitat selection data using PROC GLIMMIX. My original model was as follows:
PROC GLIMMIX DATA=HABITAT;
CLASS ID YEAR EXPOSURE TREAT HABVALUE;
MODEL VALUE (EVENT = '1') = TREAT EXPOSURE HABVALUE ROAD_LOG ELEVATION ELEVATION_QUAD SLOPE SLOPE_QUAD TREAT*EXPOSURE*HABVALUE TREAT*EXPOSURE*ROAD_LOG TREAT*EXPOSURE*ELEVATION TREAT*EXPOSURE*ELEVATION_QUAD TREAT*EXPOSURE*SLOPE TREAT*EXPOSURE*SLOPE_QUAD / DIST=BINARY LINK=LOGIT SOLUTION;
RANDOM ID(TREAT) TREAT YEAR /TYPE=VC;
RANDOM DTID / TYPE = AR(1) SUBJECT=ID;
RANDOM ELEVATION SLOPE ROAD HABVALUE/ TYPE=VC SUBJECT=ID;
RUN;
Unfortunately, this model did not run (error: insufficient memory). You can read more about this issue in my previous posting:
http://support.sas.com/forums/thread.jspa?threadID=14424&tstart=0
Susan commented on this model:
Re: Insufficient Memory
Posted: May 27, 2011 3:51 PM in response to: Buck1480
You've obviously given thought to the construction of your model. It's possible that the model you would like on theoretical grounds is too optimistic--in other words, you might like it to do more than it might be able to.
I agree with your suspicion: you may be getting a bit carried away with random-effects factors. Take a look at the Dimensions table, in particular the "Columns in Z" entry to get a sense of how big a task you've set for GLIMMIX.
Apparently, you have repeated locations (DTID) on each deer. I imagine the number varies by individual deer; about how many are there for each deer? How many deer did you follow?
Is there a random GPS location paired with each deer location? How is the random location "connected" to the deer location? Are the random and deer locations truly paired?
EXPOSURE, TREAT and HABVALUE appear to be experimental or quasi-experimental factors. What is the design unit (for example, ID) with which each of these factors is associated or to which a level of each factor was (randomly) assigned?
TREAT should not be in both MODEL and RANDOM statements. I presume that TREAT is a fixed-effects factor; if so, it should be omitted from the first RANDOM statement.
RANDOM ID(TREAT) implies that a level of TREAT was assigned to each ID. Is that true?
Often, but not necessarily, DTID as a repeated measures factor would be included in the MODEL statement. To be honest, I'm not sure what it means for DTID to be a continuous random effect (due to not being in MODEL) with an AR(1) covariance structure; perhaps someone else can weigh in on this point. I can imagine that you probably have a large number of unique DTID values.
The third RANDOM statement probably is dramatically increasing the size of the Z matrix. Unless you have a lot of repeated measures on each deer, the quality of the estimates of these random effects may be very low. Although you would like to estimate them, in practice it may not be possible.
You might try fitting a bare bones random structure for your model and then adding additional terms to see how far you can get. You can also compare the size of your X and Z matrices to those of your friend's model; yours may appear less complex but could actually be larger.
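As a rough sketch of what such a bare-bones starting point could look like (not a recommended final model): the fixed effects below are copied unchanged from the original MODEL statement, and the only random term is an intercept for each deer. This assumes ID identifies the individual deer; the design has not been confirmed. The Dimensions table that GLIMMIX prints by default reports the columns in X and Z, so each added random term can be checked against it.

```sas
/* Sketch only: same fixed effects as the original model,      */
/* but a single random intercept per deer (assumes ID = deer). */
PROC GLIMMIX DATA=HABITAT;
  CLASS ID YEAR EXPOSURE TREAT HABVALUE;
  MODEL VALUE (EVENT = '1') = TREAT EXPOSURE HABVALUE ROAD_LOG
        ELEVATION ELEVATION_QUAD SLOPE SLOPE_QUAD
        TREAT*EXPOSURE*HABVALUE TREAT*EXPOSURE*ROAD_LOG
        TREAT*EXPOSURE*ELEVATION TREAT*EXPOSURE*ELEVATION_QUAD
        TREAT*EXPOSURE*SLOPE TREAT*EXPOSURE*SLOPE_QUAD
        / DIST=BINARY LINK=LOGIT SOLUTION;
  /* One variance component: deer-to-deer variation in the intercept */
  RANDOM INTERCEPT / SUBJECT=ID;
RUN;
```

If this version converges, further random terms can be added one at a time, refitting after each addition and noting where the fit starts to fail.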
Keep in mind that fitting a generalized (binary) linear mixed model is not the same as taking the normal-error version and replacing dist=normal with dist=binary, because the binary mean determines the binary variance whereas the normal mean and variance are separate estimates. This distinction impacts the specifications of the random factors.
____________________________________________________________
Using some of Susan's suggestions, I modified the model to the following:
PROC GLIMMIX DATA=HABITAT; BY DN;
CLASS ID YEAR EXPOSURE TREAT HABVALUE;
MODEL VALUE (EVENT = '1') = EXPOSURE TREAT HABVALUE ROAD_LOG SLOPE SLOPE_QUAD ELEVATION ELEVATION_QUAD TREAT*EXPOSURE*HABVALUE TREAT*EXPOSURE*ROAD_LOG TREAT*EXPOSURE*ELEVATION TREAT*EXPOSURE*ELEVATION_QUAD TREAT*EXPOSURE*SLOPE TREAT*EXPOSURE*SLOPE_QUAD / DIST=BINARY LINK=LOGIT SOLUTION;
RANDOM ID ID(TREAT) TREAT(YEAR) YEAR / TYPE = VC;
RUN;
Unfortunately, I receive a different error message now:
NOTE: The GLIMMIX procedure is modeling the probability that Value='1'.
WARNING: Pseudo-likelihood update fails in outer iteration 3.
NOTE: Did not converge.
NOTE: The above message was for the following BY group:
DN=Diurnal
NOTE: The GLIMMIX procedure is modeling the probability that Value='1'.
NOTE: Convergence criterion (PCONV=1.11022E-8) satisfied.
NOTE: Estimated G matrix is not positive definite.
NOTE: The above message was for the following BY group:
DN=Nocturnal
NOTE: PROCEDURE GLIMMIX used (Total process time):
real time 24.95 seconds
cpu time 16.96 seconds
Does anyone have any other suggestions for modifying this model so that I can address my research questions? I'm not a seasoned statistician, so I'm unsure what "WARNING: Pseudo-likelihood update fails in outer iteration 3" actually means or whether there is a way to correct it. In addition, what does "NOTE: Estimated G matrix is not positive definite" mean, and how do you correct that problem? Thank you very much!