Hello everyone,

First post, but long-time SAS user! I've been having trouble with proc genmod. It's stumped a few people in our department, so I thought I would take it online. I am on SAS version 9.3.

I will briefly give some background on our study. We are examining motor vehicle fatalities (varname pvofat2) at the county level. One of the predictor variables we want to look at is the type of seat belt law in effect in each state, coded as a four-level categorical variable (varname lawall). Because we care about the rate per person, we use the natural log of each county's population as an offset (varname lnpop).

To begin, I am simply trying to estimate crude rates and rate ratios for each type of law. When I run the following model with a Poisson distribution, it works as I would expect:

*Model to get p-value by law only (rate ratios and rate estimate for prim/prim county type, Poisson distribution);
proc genmod data=totrate;
class lawall/param=glm;
model pvofat2 =LAWALL/offset=lnpop dist=poisson link=log type3;
lsmeans lawall/ilink exp;
ods output lsmeans=myfile;
run;

These results match what I calculate by hand in Excel. The p-values match what I get from OpenEpi (an online app for basic statistics, including chi-square tests).

However, when I run the exact same code with a negative binomial distribution, things are off. The rate estimates differ substantially, and so do the rate ratio estimates (judging from the exponentiated model parameters). Here is the code:

*Model to get p-value by law only (rate ratios and rate estimates, negbin);
proc genmod data=totrate;
class lawall/param=glm;
model pvofat2 =LAWALL/offset=lnpop dist=negbin link=log type3;
lsmeans lawall/ilink exp;
ods output lsmeans=myfile;
run;

As I said, the rate estimates for counties are off substantially. With a Poisson distribution I get a rate of 7.65 per 100,000 for one level of my categorical predictor, which matches what I calculated by hand. The same estimate from the negative binomial model is 13.68 per 100,000. According to the log, the model is converging.

Exploring a bit on my own, I think the problem has something to do with the offset option. When I don't use an offset, the model should be estimating the average number of deaths per county for each type of law. These numbers match across the Poisson model, the negative binomial model, and my hand calculation:

*Model to get p-value by law only (average fatalities/county, NB);
proc genmod data=totrate;
class lawall/param=glm;
model pvofat2 =LAWALL/dist=negbin link=log type3;
lsmeans lawall/ilink exp;
ods output lsmeans=myfile;
run;

Getting identical (and correct) results without the offset option seems fishy to me. Do I need to use a different offset option in negative binomial models? Is offsetting not possible in NB models? Are there other things I have not yet considered? I've tried running these models on different variables as well, and I consistently get the same problem.

One last thing I will note is that the dispersion parameter is <1 (underdispersion). Unless I'm mistaken, the point estimates should be the same in Poisson and negative binomial models, correct?

I'm really just rambling at this point. I would greatly appreciate any help anyone has to offer. Output available upon request.

EDIT: tidied up the code and corrected one error in the code comments.
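
For reference, the hand calculation that the Poisson estimates are compared against could be reproduced in SAS roughly as follows. This is only a sketch under two assumptions: that the crude rate for each law type is total fatalities divided by total population across its counties, and that lnpop is the natural log of the county population, so exp(lnpop) recovers it (the raw population variable is not named in this post). The output column names are made up for illustration.

```sas
/* Crude fatality rate per 100,000 for each level of lawall.          */
/* Assumption: exp(lnpop) recovers the county population, because     */
/* the raw population variable is not named in the post.              */
proc sql;
  select lawall,
         sum(pvofat2)    as total_fatalities,   /* hypothetical column name */
         sum(exp(lnpop)) as total_population,   /* hypothetical column name */
         100000 * sum(pvofat2) / sum(exp(lnpop)) as crude_rate_per_100k
  from totrate
  group by lawall;
quit;
```

Comparing crude_rate_per_100k with the lsmeans output (multiplied by 100,000) from each model shows which distribution reproduces the pooled rate.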