Given a set of phone call coordinates within a city, I'm attempting to predict one type of phone call (mcall) from another type of phone call (tcall) while controlling for the daytime population (daypop) of the city. My data are at the census tract level (n=142 census tracts, to imitate neighborhoods). Eventually, once I figure out a proper model at the simplest level, I will incorporate population characteristics as covariates to try to identify any significant characteristics at the census tract (neighborhood) level.

Data Structure Example:

TRACT   MCALL_COUNT   TCALL_COUNT   DAYPOP
01234   1,256         632           6,681
02468   875           458           4,200
...

Question 1: How do I properly use an offset if I need a rate for both types of phone calls? Would I feed tcall_count into the model as a rate instead and take the natural log [i.e. ln(tcall_count/daypop)]?

Question 2: Is the RANDOM statement set up properly so that all census tracts are considered in the covariance of the model?

Question 3: When I add more covariates (i.e. population characteristics), I have had trouble with convergence. Is this due to the model structure?

Question 4: Is there a better way to model the spatial autocorrelation of the centroids of the census tracts (i.e. lat_centered lon_centered)?

Here's the simplest version of what I've tried:

proc glimmix data=analysis_n;
ln_daypop = log(daypop);
model mcall_count = tcall_count / dist=poisson offset=ln_daypop solution;
random _residual_ / subject=intercept type=sp(exp)(lat_centered lon_centered);
run;

SAS Version 9.4 M5
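
For reference, here is how I understand the mean structure of the model above, assuming the default log link for dist=poisson and ignoring the residual covariance. The symbols \beta_0, \beta_1 and the tract index i are my own notation for illustration, not SAS output. The first line is what the code currently fits; the second line is the alternative I'm asking about in Question 1, with tcall entered as a logged rate.

```latex
% Current model: the offset log(daypop_i) enters with a fixed coefficient of 1,
% so the response is effectively the mcall rate per unit of daytime population.
\begin{align*}
\log \mathbb{E}[\mathrm{mcall}_i]
  &= \beta_0 + \beta_1\,\mathrm{tcall}_i + \log(\mathrm{daypop}_i) \\
\Longleftrightarrow\quad
\log \frac{\mathbb{E}[\mathrm{mcall}_i]}{\mathrm{daypop}_i}
  &= \beta_0 + \beta_1\,\mathrm{tcall}_i
\end{align*}

% Alternative from Question 1: the predictor is also expressed as a logged rate.
\begin{equation*}
\log \frac{\mathbb{E}[\mathrm{mcall}_i]}{\mathrm{daypop}_i}
  = \beta_0 + \beta_1 \log\!\left(\frac{\mathrm{tcall}_i}{\mathrm{daypop}_i}\right)
\end{equation*}
```

In the first form the left-hand side is a rate but tcall is still a raw count, which is the mismatch I'm unsure about.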