Hi, I'm having trouble calculating adjusted incidence rates per year using proc genmod (SAS version 9.4). The issue is, the results from the Estimate statements from my adjusted model are completely different (like, 5 decimal places larger) than the unadjusted incidence ratios that I get when I make the calculations by hand and I assume I'm adjusting incorrectly.
Here is a sample of my data:
Obs 
ID 
Event 
yrstoEvent 
covar1 
covar2 
covar3 
covar4 
covar5 
Covar6 
yearofevent 

1 
1 
1 
0.01644 
3 
2 
3 
3 
2 
1 
2014 

2 
2 
0 
1.36712 
3 
2 
3 
2 
2 
0 
2015 

3 
3 
1 
0.44932 
2 
2 
3 
2 
2 
2 
2016 

4 
4 
1 
0.04658 
3 
2 
3 
2 
2 
0 
2015 

5 
5 
1 
0.33425 
1 
2 
1 
2 
2 
0 
2012 

6 
6 
0 
2.57534 
1 
2 
1 
2 
2 
0 
2012 

7 
7 
0 
2.95068 
2 
2 
1 
2 
2 
1 
2011 

8 
8 
0 
0.16986 
3 
2 
3 
3 
2 
0 
2014 

9 
9 
1 
0.03836 
2 
1 
1 
2 
2 
0 
2011 

10 
10 
0 
0.18904 
2 
2 
3 
2 
2 
0 
2014 
I first used proc sql to calculate the number of events and amount of persontime for each strata of the covariates, using this code:
proc sql;
create table adjtable
as
select sum(Event) as sumEvent,count(unique id) as totalpts,sum(yrstoEvent)as sumofyears,
calculated sumofyears * calculated totalpts as personyears,
covar1, covar2, covar3, covar4, covar5, covar6, yearofevent from have group by covar1, covar2, covar3, covar4, covar5, covar6, yearofevent;
quit;
Next, I calculated the offset variable:
data adjtable1;
set adjtable;
ln=log(personyears);run;
Finally, I ran this code to calculate the adjusted incidence per year of event:
proc genmod data=adjtable1;
class yearofEvent covar1 covar2 covar3 covar4 covar5 covar6 ;
model sumEvent= yearofEvent covar1 covar2 covar3 covar4 covar5 covar6 /dist=poisson offset=ln;
estimate "2010" intercept 1 yearofEvent 1 0 0 0 0 0 0;
estimate "2011" intercept 1 yearofEvent 0 1 0 0 0 0 0;
estimate "2012" intercept 1 yearofEvent 0 0 1 0 0 0 0;
estimate "2013" intercept 1 yearofEvent 0 0 0 1 0 0 0;
estimate "2014" intercept 1 yearofEvent 0 0 0 0 1 0 0;
estimate "2015" intercept 1 yearofEvent 0 0 0 0 0 1 0;
estimate "2016" intercept 1 yearofEvent 0 0 0 0 0 0 1;
run;
Is this code the correct way to obtain adjusted incidence rates?
Any help is greatly appreciated!
Without knowing much about your data, I would think the persontime would just be the sum of your YRSTOEVENT variable.
Probably because the values you are getting have all of the covariates set at zero which is probably not what you want. Add the E option in your ESTIMATE statements to see the linear combination of model parameters that was used. Because of this extremely common problem of setting reasonable contrast coefficients, you should avoid using the ESTIMATE (or CONTRAST) statement and use instead the LSMEANS (or SLICE or LSMESTIMATE) statement whenever possible. Regardless of the statement you use, it is advisable to include the E option so that you can see exactly what you are estimating.
See this note that discusses and illustrates using the LSMEANS statement to estimate rates. Using the example there, the following shows that the ESTIMATE statement sets the covariate coefficient to zero, while the LSMEANS statement uses the mean of the covariate (2):
proc genmod data=insure;
class age;
model c = carnum age / dist=poisson link=log offset=ln;
estimate "Rate: age=1, small" intercept 1 age 1 0 / e;
lsmeans age / ilink e;
run;
Thank you! That does explain a lot and using the means rather than 0 values for the covariates brought the estimates of the means down. But even if I run the model without covariates, the estimates of the means are still about 1000 times greater than the crude incidence rate that I get when I do it by hand, which doesn't seem right. It makes me wonder if my proc sql code is wrong?
Here's the unadjusted model that still gives a huge mean estimates:
proc genmod data=adjtable1;
class yearofEvent ;
model sumEvent= yearofEvent /dist=poisson link=log offset=ln;
lsmeans yearofEvent / cl e ilink ;
run;
Thank you again, I really appreciate it!
Without knowing much about your data, I would think the persontime would just be the sum of your YRSTOEVENT variable.
Thank you! The combination of using LSMEANS instead of estimate statements and the sum of the yearstoevent variable gave me answers much more in line with the crude incidence rates! I really appreciate it.
Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.
Register today!ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.
Find more tutorials on the SAS Users YouTube channel.