I have TB case surveillance data for a county from 2006-2012. I want to run a Poisson regression and calculate incidence rates to determine which years had the highest incidence rates and to identify any years with large outbreaks. I just want to make sure I am doing this right in SAS. I was told to use the ESTIMATE statement and have never used it before.
This is my code:
proc genmod data=TB;
  class year(param=ref ref='2006');
  model CASES = YEAR / dist=poisson offset=LOGN;
  estimate '2006' year 1 0 0 0 0 0 0 0 0 0 0 0 / exp;
  estimate '2007' year 0 1 0 0 0 0 0 0 0 0 0 0 / exp;
  estimate '2008' year 0 0 1 0 0 0 0 0 0 0 0 0 / exp;
  estimate '2009' year 0 0 0 1 0 0 0 0 0 0 0 0 / exp;
  estimate '2010' year 0 0 0 0 1 0 0 0 0 0 0 0 / exp;
  estimate '2011' year 0 0 0 0 0 1 0 0 0 0 0 0 / exp;
  estimate '2012' year 0 0 0 0 0 0 1 0 0 0 0 0 / exp;
  estimate '2013' year 0 0 0 0 0 0 0 1 0 0 0 0 / exp;
  estimate '2014' year 0 0 0 0 0 0 0 0 1 0 0 0 / exp;
  estimate '2015' year 0 0 0 0 0 0 0 0 0 1 0 0 / exp;
  estimate '2016' year 0 0 0 0 0 0 0 0 0 0 1 0 / exp;
  estimate '2017' year 0 0 0 0 0 0 0 0 0 0 0 1 / exp;
run;
LOGN is a variable I created earlier that is equal to LOG(population). I have a variable for population data of each year.
Am I using the ESTIMATE statement correctly? The output doesn't seem to match the case counts. For instance, 2012 had the highest case count, but it does not have the highest incidence rate.
This is my output from the contrast statements
Any help is appreciated! Thank you!
A slightly different approach would be to use the n/d format in GENMOD. You could enter the rate as numerator/denominator for each year, so that the MODEL statement would be: model n/d=year/d=p etc. The output will give you the calculated Poisson rates along with standard errors for each year. Then you could compare the yearly rates using LSMEANS.
See this note. Your ESTIMATE statements probably need to include the Intercept as shown there. Also as shown there, the easier way is to use a single LSMEANS statement instead. It is always best to avoid the ESTIMATE statement if you can instead use the LSMEANS, SLICE, or LSMESTIMATE statements which don't require you to properly define the contrasts to estimate or test.
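To make the Intercept point concrete: with the log link and the offset, the model is log(expected cases) = LOGN + intercept + year effect, so a year's rate is exp(intercept + year effect). An ESTIMATE that lists only the year coefficient and uses EXP returns exp(year effect), which is a rate ratio against the reference year and not a rate. Below is a minimal sketch of both fixes, not tested on your data. It assumes one row per year, seven year levels (2006-2012, as described in the question; the posted code lists twelve, so adjust the number of coefficients to match your data), and the default GLM coding for the CLASS variable, which LSMEANS requires. As I understand it, LSMEANS does not add the offset, so with ILINK the mean column should be on the rate scale (cases per person); please confirm that against cases/population for one year.

```sas
proc genmod data=TB;
  /* Default GLM coding (no PARAM=REF): one column per year level,  */
  /* which is what the LSMEANS statement needs.                     */
  class year;
  model CASES = year / dist=poisson link=log offset=LOGN;

  /* Easiest route: one statement, no contrast coefficients.        */
  /* ILINK back-transforms the log rate; CL adds confidence limits. */
  lsmeans year / ilink cl;

  /* ESTIMATE route: Intercept plus the year column gives the log   */
  /* rate, and EXP turns it into the rate.                          */
  /* Assumes 7 levels sorted 2006..2012 (an assumption, see above). */
  estimate 'Rate 2006' intercept 1 year 1 0 0 0 0 0 0 / exp;
  estimate 'Rate 2012' intercept 1 year 0 0 0 0 0 0 1 / exp;
run;
```

If you keep PARAM=REF with ref='2006', the year effect has one column fewer (2007 onward), so the 2006 rate is just exp(Intercept) and, for example, the 2007 rate would be intercept 1 year 1 0 0 0 0 0. Either way, with one row per year and year as the only predictor, each estimated rate should equal that year's cases divided by its population, which is a quick check that the coefficients are lined up. It also explains why the year with the most cases need not have the highest rate: the rate depends on the population that year as well.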