I am trying to run what should be a straightforward Poisson regression to obtain adjusted mortality rates at the county level. To be clear up front, my question is not about choosing between a marginal (GEE) and a subject-specific model. My data consist of the following variables: y = death counts, Ni = population at risk, sex (M/F), Agegrp (3 age groups), and two county-level covariates: X1 = % educated and X2 = % employed. My aim is to obtain mortality rates per 100,000 at the county level, adjusted for the explanatory variables in the model shown below.
I want to fit the following Poisson regression model, with log(Ni) as an offset:
log(lambda_i) = b0 + b1*Agegrp + b2*Sex + b3*X1 + b4*X2 + ui
where ui is the county random effect.
I followed the example in the SAS help for a random-effects model fitted with PROC GLIMMIX (v9.4). This approach runs, but the rates I get are not at the county level: I get one rate for each combination of the explanatory variables instead. Is there a way to obtain the rates at the county level? In the end I want to rank the counties by their adjusted mortality rates from the Poisson regression. Fitting a marginal model (GEE) did not help either. I would welcome any suggestions.
The code is as below:
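PROC GLIMMIX DATA=mydata;
  /* sex is coded M/F and agegrp has 3 groups, so both are listed as CLASS variables */
  CLASS county agegrp sex;
  MODEL deaths = agegrp sex X1 X2 / DIST=poisson OFFSET=log_Ni S DDFM=Satterth;
  RANDOM county;
  /* rate per 100,000 for each input row: fixed effects plus the county random effect */
  My_Rate = 100000*exp(_zgamma_ + _xbeta_);
  ID county deaths pop My_Rate;
  OUTPUT OUT=got_you;
RUN;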
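One possible way to get a single number per county, offered as a sketch rather than a definitive answer: request the random-effect solutions and exponentiate each county's intercept. The data set names county_blups and county_rank and the variable rel_rate are made up for illustration; everything else mirrors the call above, with the random intercept written in its SUBJECT= form.

```sas
PROC GLIMMIX DATA=mydata;
  CLASS county agegrp sex;
  MODEL deaths = agegrp sex X1 X2 / DIST=poisson OFFSET=log_Ni S;
  /* SOLUTION prints the predicted county intercepts (one row per county) */
  RANDOM INTERCEPT / SUBJECT=county SOLUTION;
  /* save that table; county_blups is a hypothetical data set name */
  ODS OUTPUT SolutionR=county_blups;
RUN;

DATA county_rank;
  SET county_blups;
  /* rate relative to an average county with the same covariate values */
  rel_rate = exp(Estimate);
RUN;

PROC SORT DATA=county_rank;
  BY DESCENDING rel_rate;
RUN;
```

Because rel_rate depends only on the county intercept, sorting by it ranks the counties after adjustment for age group, sex, X1 and X2. Multiplying it by a common reference rate would put it back on the per-100,000 scale without changing the ranking.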
Thank you, Haris. I much appreciate your suggestion. I was not sure I was doing the right thing, but as you suggested, what I need are indeed the random intercepts.
Thanks once more for the clarification; it is very helpful. I had been confused about the correct interpretation of the random effects, and you understood me correctly. It's much clearer now. In my case, with a 2-level hierarchy in the data, the county random intercepts are enough, because I am looking for each county's difference from the national (i.e. sample) average. It also makes sense that if I had an additional level, e.g. districts, the random effects at that level would represent differences from the county average.
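The interpretation above can be written out as a short derivation. This is a sketch under stated assumptions: the index j (an age-by-sex cell within county i), the normal distribution for ui (the PROC GLIMMIX default for random effects) and the reference covariate profile x_ref are added here for illustration and are not part of the original model statement.

```latex
% i indexes counties; j (age-by-sex cell within county) is an added, assumed index
\[
\log \lambda_{ij} = b_0 + b_1\,\mathrm{Agegrp}_{ij} + b_2\,\mathrm{Sex}_{ij}
                  + b_3 X_{1i} + b_4 X_{2i} + u_i,
\qquad u_i \sim N(0, \sigma_u^2)
\]

% rate in county i relative to an average county (u = 0) with the same covariates
\[
\frac{\lambda_{ij}}{\lambda_{ij}\big|_{u_i = 0}} = e^{u_i}
\]

% adjusted rate per 100,000 at a common reference profile x_ref (an assumed choice)
\[
R_i = 100000 \, \exp\!\left( x_{\mathrm{ref}}^{\top} b + u_i \right)
\]
```

Since R_i increases with ui for any fixed x_ref, ranking counties by the predicted ui gives the same order as ranking them by R_i.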