Hello,
Please I need help with my data structure and calculating adjusted incidence rate with CI to output on a linear trend graph. Please see below for my current data structure which is a unique ID level dataset with millions of records:
ID YEAR EVENT FOLLOW-UP
1 2006 1 50
2 2006 0 100
3 2014 1 60
4 2006 1 20
5 2011 1 110
6 2006 0 10
7 2008 1 150
Below is the dataset I want in order to calculate incidence rate per person-time (i.e. follow-up x # of persons), including 95% CIs with Poisson regression:
YEAR #OF PERSONS EVENT FOLLOW-UP
2006 4 2 720
2008 1 1 150
2011 1 1 110
2014 1 1 60
Please any help with this and how to output the incidence rate and 95% CI unto a linear trend line plot (x-axis= year and y-axis = incidence rate).
This gets you the data set that you want, and it also provides the plot. You're probably going to need to do the regression to get those results that you need.
proc sql;
create table want as
select
year,
count(id) as num_persons,
sum(event) as event_sum,
sum(follow_up) * calculated num_persons as person_time
from
have
group by
year;
quit;
proc sgplot data = want;
series x = year y = person_time;
run;
Obs year num_persons event_sum person_time 1 2006 4 2 720 2 2008 1 1 150 3 2011 1 1 110 4 2014 1 1 60
Note, you could also use PROC SUMMARY or PROC MEANS, and maybe even PROC UNIVARIATE to get the same results. It just seemed a little easier in my head to do it with PROC SQL since it required some calculations. The regression may output plots if you request it.
In your output, how is the 720 on row 1 calculated from the original data?
It is calculated as the number of days since the cohort entry year until last day of follow up for each person, and then summed up for all persons in each year. So I'm wondering, first, how do I aggregate the total persons each year so I get a year-level dataset.
Hello,
Please I need help with my data structure and calculating adjusted incidence rate with CI to output on a linear trend graph. Please see below for my current data structure which is a unique ID level dataset with millions of records:
ID YEAR EVENT FOLLOW-UP
1 2006 1 50
2 2006 0 100
3 2014 1 60
4 2006 1 20
5 2011 1 110
6 2006 0 10
7 2008 1 150
Below is the dataset I want in order to calculate incidence rate per person-time (i.e. follow-up x # of persons), including 95% CIs with Poisson regression:
YEAR # OF PERSONS EVENT PERSON-TIME
2006 4 2 720
2008 1 1 150
2011 1 1 110
2014 1 1 60
Please any help with this and how to output the incidence rate and 95% CI unto a linear trend line plot (x-axis= year and y-axis = incidence rate).
This gets you the data set that you want, and it also provides the plot. You're probably going to need to do the regression to get those results that you need.
proc sql;
create table want as
select
year,
count(id) as num_persons,
sum(event) as event_sum,
sum(follow_up) * calculated num_persons as person_time
from
have
group by
year;
quit;
proc sgplot data = want;
series x = year y = person_time;
run;
Obs year num_persons event_sum person_time 1 2006 4 2 720 2 2008 1 1 150 3 2011 1 1 110 4 2014 1 1 60
Note, you could also use PROC SUMMARY or PROC MEANS, and maybe even PROC UNIVARIATE to get the same results. It just seemed a little easier in my head to do it with PROC SQL since it required some calculations. The regression may output plots if you request it.
2006 should 180, not 720 right?
Save $250 on SAS Innovate and get a free advance copy of the new SAS For Dummies book! Use the code "SASforDummies" to register. Don't miss out, May 6-9, in Orlando, Florida.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.