Hi, I'm analyzing a complex, hierarchical dataset on habitat selection by animals. The procedure I'm using generates random GPS locations and pairs them with the actual animal locations to model the probability of an animal using a resource. To start, I developed quadratic terms because animals often avoid the lowest and highest values of a given landscape feature. When modeling higher-order terms (i.e., quadratic), it is necessary to also include the lower-order terms in the model. For a quadratic polynomial, the lower-order (linear) term represents the overall effect of the covariate; without the linear term, the covariate effect is forced to be a parabola with its minimum or maximum at the origin, so over positive covariate values it can only increase or decrease monotonically (Darlington 1990). I also natural-log-transformed road distance to allow its influence to decrease in magnitude with increasing distance (i.e., a non-linear association). To ensure that the natural log was never taken of a cell with a value of 0, I added 0.1 to all original values (new = log(original + 0.1)).
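To make the reasoning about the linear term and the log transform concrete, here is a short sketch on the logit scale. The symbols (\beta_0, \beta_1, \beta_2, \gamma, x for a covariate such as slope or elevation, d for road distance) are my own notation for illustration and are not names from the SAS output.

```latex
% Quadratic covariate effect on the linear predictor (logit scale)
\eta(x) = \beta_0 + \beta_1 x + \beta_2 x^2,
\qquad
\frac{d\eta}{dx} = \beta_1 + 2\beta_2 x = 0
\;\Longrightarrow\;
x^{*} = -\frac{\beta_1}{2\beta_2}.

% Dropping the linear term is the same as fixing \beta_1 = 0,
% which pins the turning point at the origin:
\beta_1 = 0 \;\Longrightarrow\; x^{*} = 0.

% Log-transformed road distance: the marginal effect shrinks in
% magnitude as distance d increases
\eta(d) = \dots + \gamma \log(d + 0.1),
\qquad
\frac{d\eta}{dd} = \frac{\gamma}{d + 0.1}.
```

With the linear term included, the turning point x* is estimated from the data and can fall anywhere along the covariate range, which is what lets the model describe avoidance of both the lowest and the highest values.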
I ran a simple analysis this morning examining slope and slope_quad (original value * original value), elevation and elevation_quad (original value * original value), and distance to nearest road and log_distance to nearest road, to see which model fit the data best (AIC model selection). I then used the lowest-AIC models to build a full model (see below):
PROC GLIMMIX DATA=HABITAT;
    CLASS ID YEAR EXPOSURE TREAT HABVALUE;
    MODEL VALUE (EVENT = '1') = TREAT EXPOSURE HABVALUE
          ROAD_LOG ELEVATION ELEVATION_QUAD SLOPE SLOPE_QUAD
          TREAT*EXPOSURE*HABVALUE
          TREAT*EXPOSURE*ROAD_LOG
          TREAT*EXPOSURE*ELEVATION
          TREAT*EXPOSURE*ELEVATION_QUAD
          TREAT*EXPOSURE*SLOPE
          TREAT*EXPOSURE*SLOPE_QUAD
          / DIST=BINARY LINK=LOGIT SOLUTION;
    RANDOM ID(TREAT) TREAT YEAR / TYPE=VC;
    RANDOM DTID / TYPE=AR(1) SUBJECT=ID;
    RANDOM ELEVATION SLOPE ROAD HABVALUE / TYPE=VC SUBJECT=ID;
RUN;
ID = Animal Identification (unique value)
YEAR = 2008 AND 2009
EXPOSURE: Initial and Prolong
TREAT: Control, Low, and High
HABVALUE: (1: Mixed forest/grassland; 2: Forest; 3: Grassland)
RANDOM EFFECTS:
RANDOM ID(TREAT) TREAT YEAR /TYPE=VC;
/*MEANING: SELECTIONS OF RESOURCES MADE BY A DEER ARE MORE SIMILAR (CORRELATED) WITHIN EACH TREATMENT; TREATMENTS ARE SIMILAR FROM YEAR TO YEAR (ASSUMING THEY HAVE THE SAME INFLUENCE EVEN WHEN TREATMENTS WERE RANDOMLY ASSIGNED IN YEAR 2); OBSERVATIONS WITHIN A YEAR ARE MORE SIMILAR THAN OBSERVATIONS BETWEEN THE 2 YEARS*/
RANDOM TIME / TYPE = AR(1) SUBJECT=ID;
/*NEED TO HAVE A COLUMN THAT IS A CONTINUOUS VARIABLE THAT IS A DATE AND TIME INDICATOR (MERGE DATE AND TIME INTO 1 DATE/TIME STAMP, I CREATED THIS USING SAS); MODEL WITH AR(1); THIS WILL ACCOUNT FOR THE TEMPORAL AUTOCORRELATION IN THE DATASET FOR BOTH OBSERVED AND RANDOM LOCATIONS*/
RANDOM ELEVATION SLOPE DIST_ROAD HABITAT / TYPE=VC SUBJECT=ID;
/*THIS MODELS THE CORRELATION OF RESOURCE SELECTION WITHIN INDIVIDUALS - MEANING THE SELECTION OF ELEVATION BETWEEN INTERVALS IS CORRELATED WITHIN AN INDIVIDUAL; WITH TYPE=VC IT ALSO ASSUMES THAT THERE IS RANDOM SELECTION OF RESOURCES (IND. VARS.) AND THAT THE IND. VARS. ARE NOT CORRELATED WITH OTHER IND. VARS.; STATED ANOTHER WAY - EACH ANIMAL HAS ITS OWN RELATIONSHIP WITH ELEVATION AND THESE RELATIONSHIPS ARE NORMALLY DISTRIBUTED AMONG ANIMALS; MODELING IT THIS WAY IS CLOSER TO ECOLOGICAL REALITY BECAUSE THE ANIMALS ARE A SAMPLE AND EACH ANIMAL IS USING A SAMPLE OF THE AVAILABLE ELEVATIONS*/
Unfortunately, when I run this model I continue to receive the following message:
NOTE: The GLIMMIX procedure is modeling the probability that Value='1'.
ERROR: Integer overflow on computing amount of memory required.
NOTE: The SAS System stopped processing this step because of insufficient memory.
NOTE: PROCEDURE GLIMMIX used (Total process time):
real time 17.03 seconds
cpu time 6.04 seconds
I tried running this model on another computer with more RAM, but with no luck. I believe the RANDOM effects are causing the insufficient-memory problem and may need to be revised somehow. A friend of mine has run models on even more complex datasets than mine using a normal desktop computer, but for some reason I cannot get this model to run correctly. Any thoughts on how to resolve my problem? Thank you very much!