BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Ogee
Fluorite | Level 6

Hello,
Please I need help with my data structure and calculating adjusted incidence rate with CI to output on a linear trend graph. Please see below for my current data structure which is a unique ID level dataset with millions of records:
ID YEAR EVENT FOLLOW-UP
1    2006   1             50
2    2006   0           100
3    2014   1            60
4   2006    1            20
5   2011    1           110
6   2006    0           10
7   2008    1          150
Below is the dataset I want in order to calculate incidence rate per person-time (i.e. follow-up x # of persons), including 95% CIs with Poisson regression:
YEAR   #OF PERSONS  EVENT   FOLLOW-UP
2006           4                     2              720
2008          1                      1              150
2011          1                      1               110
2014          1                      1                 60
Please any help with this and how to output the incidence rate and 95% CI unto a linear trend line plot (x-axis= year and y-axis = incidence rate).

1 ACCEPTED SOLUTION

Accepted Solutions
maguiremq
SAS Super FREQ

This gets you the data set that you want, and it also provides the plot. You're probably going to need to do the regression to get those results that you need.

 

proc sql;
	create table 	want as
		select	
					year,
					count(id) 								as num_persons,
					sum(event)								as event_sum,
					sum(follow_up) * calculated num_persons as person_time
		from
					have
		group by
					year;
quit;

proc sgplot data = want;
	series x = year y = person_time;
run;
Obs year num_persons event_sum person_time 
1 2006 4 2 720 
2 2008 1 1 150 
3 2011 1 1 110 
4 2014 1 1 60 

maguiremq_0-1619533905912.png

 

Note, you could also use PROC SUMMARY or PROC MEANS, and maybe even PROC UNIVARIATE to get the same results. It just seemed a little easier in my head to do it with PROC SQL since it required some calculations. The regression may output plots if you request it.

View solution in original post

7 REPLIES 7
PaigeMiller
Diamond | Level 26

In your output, how is the 720 on row 1 calculated from the original data?

--
Paige Miller
Ogee
Fluorite | Level 6

It is calculated as the number of days since the cohort entry year until last day of follow up for each person, and then summed up for all persons in each year. So I'm wondering, first, how do I aggregate the total persons each year so I get a year-level dataset.

Ogee
Fluorite | Level 6

Hello,

Please I need help with my data structure and calculating adjusted incidence rate with CI to output on a linear trend graph. Please see below for my current data structure which is a unique ID level dataset with millions of records:

ID           YEAR          EVENT     FOLLOW-UP

1              2006             1                    50

2              2006             0                   100

3              2014             1                    60

4              2006             1                    20

5              2011              1                  110

6              2006              0                  10

7              2008              1                  150

Below is the dataset I want in order to calculate incidence rate per person-time (i.e. follow-up x # of persons), including 95% CIs with Poisson regression:

YEAR    # OF PERSONS      EVENT    PERSON-TIME

2006               4                        2                    720

2008               1                        1                   150

2011               1                         1                   110

2014               1                         1                   60

Please any help with this and how to output the incidence rate and 95% CI unto a linear trend line plot (x-axis= year and y-axis = incidence rate).

 

maguiremq
SAS Super FREQ

This gets you the data set that you want, and it also provides the plot. You're probably going to need to do the regression to get those results that you need.

 

proc sql;
	create table 	want as
		select	
					year,
					count(id) 								as num_persons,
					sum(event)								as event_sum,
					sum(follow_up) * calculated num_persons as person_time
		from
					have
		group by
					year;
quit;

proc sgplot data = want;
	series x = year y = person_time;
run;
Obs year num_persons event_sum person_time 
1 2006 4 2 720 
2 2008 1 1 150 
3 2011 1 1 110 
4 2014 1 1 60 

maguiremq_0-1619533905912.png

 

Note, you could also use PROC SUMMARY or PROC MEANS, and maybe even PROC UNIVARIATE to get the same results. It just seemed a little easier in my head to do it with PROC SQL since it required some calculations. The regression may output plots if you request it.

Ogee
Fluorite | Level 6
Thank you so much for sharing this code and it works great 🙂 !
Reeza
Super User
FYI - I have merged your multiple posts. Please post your question only once.
haoduonge
Quartz | Level 8

2006 should 180, not 720 right?

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 7 replies
  • 5851 views
  • 1 like
  • 5 in conversation