Hello everyone and thanks in advance for helping me out with my problem (or taking the time to read my post),
I am trying to calculate age-adjusted incidence rates (not ratios) for a medical procedure for each calendar year in my data set (2000-2012). See the code pasted below. For some reason, when I run this, I get all zeros back for the incidence rates specified in the ESTIMATE statements. I tried exponentiating the beta coefficients myself and got the same problem (results on the order of 10^-11). Again, results are included below.
Now, I know I shouldn't do this, but when I add age agesq ageneg1 to the CLASS statement my estimates seem to make sense. Still - I don't trust them since I want age modeled continuously, not categorically.
My data is in the form of aggregated count and person time (one observation for each year of age, for each calendar year). Please let me know if working in this aggregated format is the problem, because I can obtain more detailed data.
Thanks for your time! Sorry if I'm missing something very obvious. I've been searching through the documentation for SAS procedures and nothing helped me to see the error of my ways.
Here's the code I'm trying to run:
PROC GENMOD DATA=work.CountData_AgeCyear;
CLASS CalYr;
MODEL TotalCount = age agesq ageneg1 CalYr
/ DIST=poi LINK=log OFFSET=log_pt SCALE=deviance;
ESTIMATE "IR: 2000" int 1 calyr 1 0 0 0 0 0 0 0 0 0 0 0 0;
ESTIMATE "IR: 2001" int 1 calyr 0 1 0 0 0 0 0 0 0 0 0 0 0;
ESTIMATE "IR: 2002" int 1 calyr 0 0 1 0 0 0 0 0 0 0 0 0 0;
ESTIMATE "IR: 2003" int 1 calyr 0 0 0 1 0 0 0 0 0 0 0 0 0;
ESTIMATE "IR: 2004" int 1 calyr 0 0 0 0 1 0 0 0 0 0 0 0 0;
ESTIMATE "IR: 2005" int 1 calyr 0 0 0 0 0 1 0 0 0 0 0 0 0;
ESTIMATE "IR: 2006" int 1 calyr 0 0 0 0 0 0 1 0 0 0 0 0 0;
ESTIMATE "IR: 2007" int 1 calyr 0 0 0 0 0 0 0 1 0 0 0 0 0;
ESTIMATE "IR: 2008" int 1 calyr 0 0 0 0 0 0 0 0 1 0 0 0 0;
ESTIMATE "IR: 2009" int 1 calyr 0 0 0 0 0 0 0 0 0 1 0 0 0;
ESTIMATE "IR: 2010" int 1 calyr 0 0 0 0 0 0 0 0 0 0 1 0 0;
ESTIMATE "IR: 2011" int 1 calyr 0 0 0 0 0 0 0 0 0 0 0 1 0;
ESTIMATE "IR: 2012" int 1 calyr 0 0 0 0 0 0 0 0 0 0 0 0 1;
RUN;
Here are the parameter estimates I get (all 0's are output from my ESTIMATE statements [not displayed]):
Parameter DF Estimate Standard Error Wald 95% Confidence Limits Chi-Square Pr > ChiSq
Intercept 1 -24.3139 0.3298 -24.9603 -23.6675 5434.65 <.0001
AGE 1 0.3573 0.0068 0.3439 0.3707 2737.97 <.0001
agesq 1 -0.0018 0.0000 -0.0019 -0.0017 1798.56 <.0001
ageneg1 1 179.2731 4.6522 170.1550 188.3913 1484.96 <.0001
CalYr 2000 1 -0.0399 0.0567 -0.1511 0.0712 0.50 0.4813
CalYr 2001 1 -0.0313 0.0431 -0.1158 0.0533 0.52 0.4688
CalYr 2002 1 0.0528 0.0360 -0.0179 0.1235 2.15 0.1430
CalYr 2003 1 -0.0556 0.0316 -0.1174 0.0063 3.10 0.0783
CalYr 2004 1 0.0254 0.0282 -0.0299 0.0808 0.81 0.3678
CalYr 2005 1 0.1314 0.0265 0.0794 0.1833 24.53 <.0001
CalYr 2006 1 0.3067 0.0256 0.2565 0.3569 143.36 <.0001
CalYr 2007 1 0.3010 0.0254 0.2512 0.3509 140.00 <.0001
CalYr 2008 1 0.3772 0.0224 0.3333 0.4211 283.17 <.0001
CalYr 2009 1 0.3541 0.0224 0.3101 0.3980 249.30 <.0001
CalYr 2010 1 0.3451 0.0223 0.3013 0.3889 238.80 <.0001
CalYr 2011 1 0.2672 0.0219 0.2242 0.3101 148.81 <.0001
CalYr 2012 0 0.0000 0.0000 0.0000 0.0000 . .
Scale 0 2.0939 0.0000 2.0939 2.0939
Some other output that may be useful:
Class Level Information
Class Levels Values
CalYr 13 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009
2010 2011 2012
Criteria For Assessing Goodness Of Fit
Criterion DF Value Value/DF
Deviance 1040 4559.5819 4.3842
Scaled Deviance 1040 1040.0000 1.0000
Pearson Chi-Square 1040 4674.5441 4.4948
Scaled Pearson X2 1040 1066.2219 1.0252
Log Likelihood 157709.7869
Full Log Likelihood -5447.7431
AIC (smaller is better) 10927.4862
AICC (smaller is better) 10928.0098
BIC (smaller is better) 11006.8821
This approach gives you the estimate at age=0, agesq=0 and ageneg1=0. First off, those values cannot all hold at once if ageneg1 is defined as 1/age (1/0 is undefined), so the value you end up predicting is meaningless. It also explains the zeros: with an intercept of about -24.3, exp(-24.3) is roughly 3*10^-11. Why not shift to LSMEANS, rather than ESTIMATE? By default, the values obtained would be at the mean values of age, agesq and ageneg1. By careful use of the AT option, you would instead get the least squares means at the specified values of age, agesq and ageneg1.
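A minimal sketch of the LSMEANS route, assuming ageneg1 really is 1/age and taking age 60 as a purely illustrative reference age (pick whatever makes sense for your population). Fixing all three covariates with AT keeps them mutually consistent, which the default covariate means would not be, since the mean of agesq is not the square of the mean age. ILINK requests the estimates on the rate scale. As far as I know, the offset is not part of the LS-means computation, so the result should be read as a rate per one unit of person-time.

```sas
/* Sketch only: age 60 is a hypothetical reference age, not taken from the data */
PROC GENMOD DATA=work.CountData_AgeCyear;
  CLASS CalYr;
  MODEL TotalCount = age agesq ageneg1 CalYr
        / DIST=poi LINK=log OFFSET=log_pt SCALE=deviance;
  /* Hold the covariates at age=60, agesq=60**2, ageneg1=1/60 (assumed definitions) */
  LSMEANS CalYr / AT (age agesq ageneg1) = (60 3600 0.0166667) ILINK CL;
RUN;
```

This should produce one row per calendar year, with CL adding confidence limits.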
If you prefer to stay with the ESTIMATE statement, you will need to supply plausible values of the continuous covariates as coefficients in each ESTIMATE statement.
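For the ESTIMATE route, something along these lines might work. It is a sketch under the same assumptions: ageneg1 is 1/age, and age 60 is a hypothetical reference age chosen only for illustration. The EXP option asks GENMOD to report the exponentiated estimate, so you do not have to exponentiate by hand.

```sas
/* Sketch only: add to the existing PROC GENMOD step; age 60 is a hypothetical choice */
/* Coefficients: age=60, agesq=60**2=3600, ageneg1=1/60 (assuming ageneg1 = 1/age)   */
ESTIMATE "IR: 2000, age 60" int 1 age 60 agesq 3600 ageneg1 0.0166667
         calyr 1 0 0 0 0 0 0 0 0 0 0 0 0 / EXP;
ESTIMATE "IR: 2012, age 60" int 1 age 60 agesq 3600 ageneg1 0.0166667
         calyr 0 0 0 0 0 0 0 0 0 0 0 0 1 / EXP;
```

As a sanity check against the posted parameter estimates, the linear predictor for 2000 at age 60 is about -24.3139 + 0.3573*60 - 0.0018*3600 + 179.2731/60 - 0.0399, or roughly -6.4. That corresponds to a rate on the order of 10^-3 per unit of person-time rather than 10^-11. The printed agesq coefficient is rounded, so treat that figure as approximate.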
Steve Denham