09-11-2014 01:11 PM
Parameter estimates change sign.
Hi, any help/advice on this would be greatly appreciated. I counted the number of bats present at a site and want to see whether any weather variables correlate with the number of bats counted. I expect the parameters most likely to predict the number of bats to be wind, temperature, and the wind*temperature interaction. Approximately 70% of the count data consists of zeroes, i.e. nights when no bats were counted. I expect the number of bats to be high on nights with low wind and high temperature, and lower on nights with low wind and low temperature or with high wind and high temperature.

I ran simple Poisson GLMMs in GLIMMIX with year as a random effect and the overdispersion term _residual_ in the RANDOM statement, because of the large proportion of zeroes in my count data.

- Bat Count = Wind: wind is not significant (p = 0.8) and its parameter estimate is negative (as expected).
- Bat Count = Temperature: temperature is not significant (p = 0.2) and its parameter estimate is positive (as expected).
- Bat Count = Wind + Temperature: wind (p = 0.9) and temperature (p = 0.2) are not significant, and their parameter estimates are negative and positive respectively (as expected).
- Bat Count = Wind + Temperature + Wind*Temperature: wind (p = 0.008), temperature (p = 0.003), and the interaction (p = 0.01) are all significant, but the parameter estimate for wind is now positive and the estimate for the interaction is negative.

However, when I plot the predicted counts from this last model against wind, the relationship between bat count and wind is negative even though the parameter estimate is positive.
I do not understand why the parameter estimate changes from negative (expected) to positive (not expected) and why Wind and Temperature alone were not significant. Could someone please enlighten me?
09-11-2014 10:31 PM
Several issues here, but I will focus on interactions. Main effects are very hard to interpret when interactions are significant; you have to look at the whole deterministic part of the model. Let's assume independent predictors (not realistic). Your full model is
Y = a + bW + cT + dWT
For your case, Y is the log of the expected value (the link function for Poisson). But this model can be rewritten as:
Y = a + cT + (b + dT)W
So, in a sense, it doesn't matter that b is positive when you expected negative. What matters is the sign and magnitude of b + dT (the slope for W, which depends on T). You have to look at b + dT at selected typical values of T; I expect it will be negative there. The point is that the effect of W on Y depends on T. Likewise, a + cT becomes the intercept at each selected value of T.
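To make this concrete, here is a small Python sketch of the b + dT idea. The coefficient values are hypothetical (chosen only so the signs match the fitted model, not the poster's actual estimates): a positive wind main effect b combined with a negative interaction d still gives a negative effective wind slope at typical night temperatures.

```python
# Hypothetical coefficients for illustration only (signs match the fitted model:
# positive wind main effect, negative wind*temperature interaction).
b = 0.5    # main-effect estimate for wind
d = -0.08  # interaction estimate

def wind_slope(T, b=b, d=d):
    """Effective slope of wind on the linear predictor at temperature T: b + d*T."""
    return b + d * T

# At warmer temperatures the effective wind slope turns negative,
# even though the wind main effect b is positive.
for T in (5, 10, 15, 20):
    print(f"T = {T:2d} C  ->  slope for W = {wind_slope(T):+.2f}")
# e.g. T = 10 gives slope -0.30
```

This is why the plot of predicted counts against wind can show a negative relationship while the wind coefficient itself is positive: over most of the observed temperature range, b + dT is below zero.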
W and T are likely correlated, maybe strongly. This can have a big effect on model fitting (parameter estimation). My simple interpretation above doesn't quite hold if T is related to W, so the two cannot be considered fully independently, but it gets you close.
With a lot of 0s, I am not crazy about simple multiplicative adjustments for overdispersion. I suggest you try negative binomial instead. That way you could also use method=laplace, which allows you to determine goodness of fit.
09-12-2014 12:56 PM
I am not a GENMOD person, as I learned generalized linear models coming from a mixed models background, but this may be a case for using GENMOD. I don't see any random effects in this presentation. So I would try:
proc genmod data=yourdata;
   model count = wind temperature wind*temperature / dist=zinb type3;
   zeromodel wind temperature wind*temperature / link=logit;
run;
Or, if you have SAS/ETS licensed, try PROC COUNTREG. Example 11.2 in the SAS/ETS 13.2 documentation gives a great step-by-step walk-through of how to fit zero-inflated models.
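To see what the zero model is doing, here is a hedged pure-Python sketch of the zero-inflated Poisson log-likelihood that this kind of procedure maximizes: a logit model decides the probability of a structural zero, and a log-link Poisson models the counts. The single-covariate setup and all parameter values are hypothetical.

```python
import math

def zip_loglik(y, x, beta, gamma):
    """Log-likelihood of a zero-inflated Poisson with one covariate.

    y     : list of counts
    x     : list of covariate values
    beta  : (b0, b1) for the count part, log link:   lam = exp(b0 + b1*x)
    gamma : (g0, g1) for the zero part, logit link:  pi  = logistic(g0 + g1*x)
    """
    ll = 0.0
    for yi, xi in zip(y, x):
        lam = math.exp(beta[0] + beta[1] * xi)
        pi = 1.0 / (1.0 + math.exp(-(gamma[0] + gamma[1] * xi)))
        if yi == 0:
            # A zero can come from the structural-zero process OR the Poisson.
            ll += math.log(pi + (1 - pi) * math.exp(-lam))
        else:
            # Positive counts can only come from the Poisson part.
            ll += math.log(1 - pi) + yi * math.log(lam) - lam - math.lgamma(yi + 1)
    return ll

# Toy data: mostly zeros, a few positive counts (fabricated for illustration).
y = [0, 0, 0, 0, 2, 0, 1, 0, 0, 3]
x = [4, 5, 1, 6, 1, 5, 2, 6, 3, 0]
print(zip_loglik(y, x, beta=(0.2, -0.1), gamma=(0.5, 0.2)))
```

The key line is the zero branch: the likelihood of an observed zero mixes the two sources, which is exactly the mechanism that lets the model accommodate 70% zeros without distorting the count part.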