Fluorite | Level 6

Count number of events in longitudinal data

Hello,

I am working with longitudinal data with several events of interest and would like to calculate the average duration of each event, with recurrence of the event being possible. I've tried a number of things but have spent several days being unsuccessful.

An example of the data set that I have is below, where study_id is the unique participant identifier, visit date is the date of the recurring visit, and event1 and event2 are the events of interest.

 study_id visitdate event1 event2 1001 1-Jan 0 1 1001 17-Jan 0 1 1001 12-Feb 1 0 1001 26-Feb 1 0 1001 4-Mar 1 0 1001 18-Mar 0 0 1002 1-Jan 0 0 1002 19-Jan 1 0 1002 30-Jan 1 0 1002 7-Feb 0 1 1002 19-Feb 0 1 1002 4-Mar 0 1 1003 1-Jan 1 0 1003 22-Jan 1 0 1003 10-Feb 1 0 1003 22-Feb 0 0 1003 3-Mar 0 1 1003 19-Mar 1 1 1003 2-Apr 1 1 1003 9-Apr 1 0 1003 27-Apr 1 1 1003 8-May 0 1

The data set I want is below, where e1_count is the number of events per study ID, e1_tot_dur is the total duration of all events (i.e. the sum of days of discreet events), and e1_ave_dur is the average duration of each event (variable of interest).

 Study ID e1_count e1_tot_dur e1_ave_dur e2_count e2_tot_dur e2_ave_dur 1001 1 20 20 1 16 16 1002 1 11 11 1 25 25 1003 2 79 39.5 2 41 20.5

Any advice would be much appreciated!

Jonathan

1 ACCEPTED SOLUTION

Accepted Solutions
Opal | Level 21

Re: Count number of events in longitudinal data

Simpler to do this with a long data organisation which can accomodate any number of events:

``````data have;
input study_id vd\$ event1 event2;
visitDate = input(cats(vd,"-2017"), anydtdte11.);
format visitDate yymmdd10.;
drop vd;
datalines;
1001 1-Jan 0 1
1001 17-Jan 0 1
1001 12-Feb 1 0
1001 26-Feb 1 0
1001 4-Mar 1 0
1001 18-Mar 0 0
1002 1-Jan 0 0
1002 19-Jan 1 0
1002 30-Jan 1 0
1002 7-Feb 0 1
1002 19-Feb 0 1
1002 4-Mar 0 1
1003 1-Jan 1 0
1003 22-Jan 1 0
1003 10-Feb 1 0
1003 22-Feb 0 0
1003 3-Mar 0 1
1003 19-Mar 1 1
1003 2-Apr 1 1
1003 9-Apr 1 0
1003 27-Apr 1 1
1003 8-May 0 1
;

data events;
set have(rename=event1=event) have(rename=event2=event in=in2);
eventNo = 1 + in2;
drop event1 event2;
run;

data want;
tot_dur = 0;
do until(last.study_id);
lastVisit = visitDate;
set events; by eventNo study_id;
if in then
if event then
if lastVisit > 0 then
tot_dur = sum(tot_dur, intck("day", lastVisit, visitDate));
else;
else in = 0;
else
if event then do;
in = 1;
count = sum(count, 1);
end;
end;
if count > 0 then ave_dur = tot_dur / count;
keep eventNo study_id tot_dur count ave_dur;
run;

proc print noobs; var eventNo study_id tot_dur count ave_dur; run;
``````

PG
3 REPLIES 3
Opal | Level 21

Re: Count number of events in longitudinal data

Simpler to do this with a long data organisation which can accomodate any number of events:

``````data have;
input study_id vd\$ event1 event2;
visitDate = input(cats(vd,"-2017"), anydtdte11.);
format visitDate yymmdd10.;
drop vd;
datalines;
1001 1-Jan 0 1
1001 17-Jan 0 1
1001 12-Feb 1 0
1001 26-Feb 1 0
1001 4-Mar 1 0
1001 18-Mar 0 0
1002 1-Jan 0 0
1002 19-Jan 1 0
1002 30-Jan 1 0
1002 7-Feb 0 1
1002 19-Feb 0 1
1002 4-Mar 0 1
1003 1-Jan 1 0
1003 22-Jan 1 0
1003 10-Feb 1 0
1003 22-Feb 0 0
1003 3-Mar 0 1
1003 19-Mar 1 1
1003 2-Apr 1 1
1003 9-Apr 1 0
1003 27-Apr 1 1
1003 8-May 0 1
;

data events;
set have(rename=event1=event) have(rename=event2=event in=in2);
eventNo = 1 + in2;
drop event1 event2;
run;

data want;
tot_dur = 0;
do until(last.study_id);
lastVisit = visitDate;
set events; by eventNo study_id;
if in then
if event then
if lastVisit > 0 then
tot_dur = sum(tot_dur, intck("day", lastVisit, visitDate));
else;
else in = 0;
else
if event then do;
in = 1;
count = sum(count, 1);
end;
end;
if count > 0 then ave_dur = tot_dur / count;
keep eventNo study_id tot_dur count ave_dur;
run;

proc print noobs; var eventNo study_id tot_dur count ave_dur; run;
``````

PG
Fluorite | Level 6

Re: Count number of events in longitudinal data

Incredible! Thank you so much for your assistance. You have no idea how much this has helped.

Super User

Re: Count number of events in longitudinal data

``````data have;
input study_id vd\$ event1 event2;
visitDate = input(cats(vd,"-2017"), anydtdte11.);
format visitDate yymmdd10.;
drop vd;
datalines;
1001 1-Jan 0 1
1001 17-Jan 0 1
1001 12-Feb 1 0
1001 26-Feb 1 0
1001 4-Mar 1 0
1001 18-Mar 0 0
1002 1-Jan 0 0
1002 19-Jan 1 0
1002 30-Jan 1 0
1002 7-Feb 0 1
1002 19-Feb 0 1
1002 4-Mar 0 1
1003 1-Jan 1 0
1003 22-Jan 1 0
1003 10-Feb 1 0
1003 22-Feb 0 0
1003 3-Mar 0 1
1003 19-Mar 1 1
1003 2-Apr 1 1
1003 9-Apr 1 0
1003 27-Apr 1 1
1003 8-May 0 1
;

data temp;
set have;
by study_id  event1 notsorted;
group+first.event1;
if event1=1;
run;
proc sql;
create table want1 as
select  study_id,count(distinct group) as e1_count,sum(range) as e1_tot_dur,
calculated e1_tot_dur/calculated e1_count as e1_ave_dur
from (select study_id,group,range(visitDate) as range from temp group by study_id,group)
group by study_id;
quit;``````
Discussion stats
• 3 replies
• 1910 views
• 1 like
• 3 in conversation