- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi, I'm having trouble calculating adjusted incidence rates per year using proc genmod (SAS version 9.4). The issue is, the results from the Estimate statements from my adjusted model are completely different (like, 5 decimal places larger) than the unadjusted incidence ratios that I get when I make the calculations by hand and I assume I'm adjusting incorrectly.
Here is a sample of my data:
Obs |
ID |
Event |
yrstoEvent |
covar1 |
covar2 |
covar3 |
covar4 |
covar5 |
Covar6 |
yearofevent |
|
1 |
1 |
1 |
0.01644 |
3 |
2 |
3 |
3 |
2 |
1 |
2014 |
|
2 |
2 |
0 |
1.36712 |
3 |
2 |
3 |
2 |
2 |
0 |
2015 |
|
3 |
3 |
1 |
0.44932 |
2 |
2 |
3 |
2 |
2 |
2 |
2016 |
|
4 |
4 |
1 |
0.04658 |
3 |
2 |
3 |
2 |
2 |
0 |
2015 |
|
5 |
5 |
1 |
0.33425 |
1 |
2 |
1 |
2 |
2 |
0 |
2012 |
|
6 |
6 |
0 |
2.57534 |
1 |
2 |
1 |
2 |
2 |
0 |
2012 |
|
7 |
7 |
0 |
2.95068 |
2 |
2 |
1 |
2 |
2 |
1 |
2011 |
|
8 |
8 |
0 |
0.16986 |
3 |
2 |
3 |
3 |
2 |
0 |
2014 |
|
9 |
9 |
1 |
0.03836 |
2 |
1 |
1 |
2 |
2 |
0 |
2011 |
|
10 |
10 |
0 |
0.18904 |
2 |
2 |
3 |
2 |
2 |
0 |
2014 |
I first used proc sql to calculate the number of events and amount of person-time for each strata of the covariates, using this code:
proc sql;
create table adjtable
as
select sum(Event) as sumEvent,count(unique id) as totalpts,sum(yrstoEvent)as sumofyears,
calculated sumofyears * calculated totalpts as personyears,
covar1, covar2, covar3, covar4, covar5, covar6, yearofevent from have group by covar1, covar2, covar3, covar4, covar5, covar6, yearofevent;
quit;
Next, I calculated the offset variable:
data adjtable1;
set adjtable;
ln=log(personyears);run;
Finally, I ran this code to calculate the adjusted incidence per year of event:
proc genmod data=adjtable1;
class yearofEvent covar1 covar2 covar3 covar4 covar5 covar6 ;
model sumEvent= yearofEvent covar1 covar2 covar3 covar4 covar5 covar6 /dist=poisson offset=ln;
estimate "2010" intercept 1 yearofEvent 1 0 0 0 0 0 0;
estimate "2011" intercept 1 yearofEvent 0 1 0 0 0 0 0;
estimate "2012" intercept 1 yearofEvent 0 0 1 0 0 0 0;
estimate "2013" intercept 1 yearofEvent 0 0 0 1 0 0 0;
estimate "2014" intercept 1 yearofEvent 0 0 0 0 1 0 0;
estimate "2015" intercept 1 yearofEvent 0 0 0 0 0 1 0;
estimate "2016" intercept 1 yearofEvent 0 0 0 0 0 0 1;
run;
Is this code the correct way to obtain adjusted incidence rates?
Any help is greatly appreciated!
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Without knowing much about your data, I would think the person-time would just be the sum of your YRSTOEVENT variable.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Probably because the values you are getting have all of the covariates set at zero which is probably not what you want. Add the E option in your ESTIMATE statements to see the linear combination of model parameters that was used. Because of this extremely common problem of setting reasonable contrast coefficients, you should avoid using the ESTIMATE (or CONTRAST) statement and use instead the LSMEANS (or SLICE or LSMESTIMATE) statement whenever possible. Regardless of the statement you use, it is advisable to include the E option so that you can see exactly what you are estimating.
See this note that discusses and illustrates using the LSMEANS statement to estimate rates. Using the example there, the following shows that the ESTIMATE statement sets the covariate coefficient to zero, while the LSMEANS statement uses the mean of the covariate (2):
proc genmod data=insure;
class age;
model c = carnum age / dist=poisson link=log offset=ln;
estimate "Rate: age=1, small" intercept 1 age 1 0 / e;
lsmeans age / ilink e;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you! That does explain a lot and using the means rather than 0 values for the covariates brought the estimates of the means down. But even if I run the model without covariates, the estimates of the means are still about 1000 times greater than the crude incidence rate that I get when I do it by hand, which doesn't seem right. It makes me wonder if my proc sql code is wrong?
Here's the unadjusted model that still gives a huge mean estimates:
proc genmod data=adjtable1;
class yearofEvent ;
model sumEvent= yearofEvent /dist=poisson link=log offset=ln;
lsmeans yearofEvent / cl e ilink ;
run;
Thank you again, I really appreciate it!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Without knowing much about your data, I would think the person-time would just be the sum of your YRSTOEVENT variable.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you! The combination of using LSMEANS instead of estimate statements and the sum of the yearstoevent variable gave me answers much more in line with the crude incidence rates! I really appreciate it.