10-04-2016 03:46 PM
I am trying to develop a spatiotemporal logistic regression model to predict the presence/absence of a disease in U.S. counties (contiguous U.S.) based on climatologic variables, with data points for each year between 2007 and 2014; ideally, I would like a model with functionality to score additional datasets, e.g., use the model developed for 2006-2014 to predict disease probability in future climate scenarios. The model needs to account for spatial autocorrelation, and (ideally) temporal correlation as well. Unfortunately, my SAS abilities are not up to the task. Would anyone have suggestions for developing the model? The data take the form of:
countyFIPS year outcome predictor1 predictor2 predictor3 latitude longitude
countyFIPS = unique 5-digit identifier for U.S. counties
outcome = 1 if the county had at least one case in the given year, 0 otherwise
I'm really bad at this, so please be gentle and use small words...
10-04-2016 04:01 PM
See the link above as well. Also take a look at lexjansen.com to see what papers are available; there may even be code.
10-05-2016 09:27 PM
For a logistic analysis with spatial or temporal correlation, you will need a generalized linear mixed model (GLMM) with a correlation structure for the G matrix (the covariance matrix of the random effects). This is a very complex problem, and there is no way to give easy cookbook answers. I think you should get a copy of the GLMM book by Walter Stroup. It will be difficult for you, but there are examples. The book SAS for Mixed Models, 2nd edition, has a lot on spatial and temporal correlation structures, but only for normally distributed data.
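To make the "correlation structure for the G matrix" idea a bit more concrete, here is a minimal sketch in plain Python (not SAS, and not a fitted model). It assumes an exponential spatial covariance, G[i][j] = sigma2 * exp(-d_ij / range_km), where d_ij is the great-circle distance between county centroids. The covariance form, the distance measure, the three coordinates, and the values of sigma2, range_km, and the regression coefficients are all illustrative assumptions and do not come from the posts above.

```python
import math

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in decimal degrees."""
    r_earth = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r_earth * math.asin(math.sqrt(a))

def exponential_g_matrix(coords, sigma2, range_km):
    """Covariance of the county random effects: G[i][j] = sigma2 * exp(-d_ij / range_km)."""
    n = len(coords)
    g = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d = great_circle_km(*coords[i], *coords[j])
            g[i][j] = sigma2 * math.exp(-d / range_km)
    return g

def inv_logit(eta):
    """Map a linear predictor on the logit scale to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical (latitude, longitude) centroids for three counties.
coords = [(40.0, -100.0), (40.0, -99.0), (45.0, -90.0)]

# Assumed variance and range parameters; a GLMM would estimate these from the data.
G = exponential_g_matrix(coords, sigma2=1.0, range_km=200.0)

# Conditional on a county's random effect u_i, the model is ordinary logistic regression:
# P(outcome = 1) = inv_logit(b0 + b1 * predictor1 + u_i). All values below are made up.
b0, b1, predictor1, u_i = -2.0, 0.5, 1.2, 0.3
p = inv_logit(b0 + b1 * predictor1 + u_i)
```

The first two counties are close together, so their random effects get a larger covariance than either has with the distant third county. That is how the model lets nearby counties share unexplained risk. Fitting the GLMM means estimating sigma2, the range, and the regression coefficients together, which is the hard part the books cover.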