Solved: Re: Count number of events in longitudinal data

jpsmith · Posted 11-20-2017 06:54 PM

Hello,

I am working with longitudinal data with several events of interest and would like to calculate the average duration of each event, with recurrence of the event being possible. I've tried a number of things but have spent several days being unsuccessful.

An example of the data set that I have is below, where study_id is the unique participant identifier, visit date is the date of the recurring visit, and event1 and event2 are the events of interest.

study_id	visitdate	event1	event2
1001	1-Jan	0	1
1001	17-Jan	0	1
1001	12-Feb	1	0
1001	26-Feb	1	0
1001	4-Mar	1	0
1001	18-Mar	0	0
1002	1-Jan	0	0
1002	19-Jan	1	0
1002	30-Jan	1	0
1002	7-Feb	0	1
1002	19-Feb	0	1
1002	4-Mar	0	1
1003	1-Jan	1	0
1003	22-Jan	1	0
1003	10-Feb	1	0
1003	22-Feb	0	0
1003	3-Mar	0	1
1003	19-Mar	1	1
1003	2-Apr	1	1
1003	9-Apr	1	0
1003	27-Apr	1	1
1003	8-May	0	1

The data set I want is below, where e1_count is the number of events per study ID, e1_tot_dur is the total duration of all events (i.e. the sum of days of discreet events), and e1_ave_dur is the average duration of each event (variable of interest).

Study ID	e1_count	e1_tot_dur	e1_ave_dur	e2_count	e2_tot_dur	e2_ave_dur
1001	1	20	20	1	16	16
1002	1	11	11	1	25	25
1003	2	79	39.5	2	41	20.5

Any advice would be much appreciated!

Jonathan

PGStats · Posted 11-20-2017 11:03 PM

Simpler to do this with a long data organisation which can accomodate any number of events:

data have;
input study_id vd$ event1 event2;
visitDate = input(cats(vd,"-2017"), anydtdte11.);
format visitDate yymmdd10.;
drop vd;
datalines; 
1001 1-Jan 0 1 
1001 17-Jan 0 1 
1001 12-Feb 1 0 
1001 26-Feb 1 0 
1001 4-Mar 1 0 
1001 18-Mar 0 0 
1002 1-Jan 0 0 
1002 19-Jan 1 0 
1002 30-Jan 1 0 
1002 7-Feb 0 1 
1002 19-Feb 0 1 
1002 4-Mar 0 1 
1003 1-Jan 1 0 
1003 22-Jan 1 0 
1003 10-Feb 1 0 
1003 22-Feb 0 0 
1003 3-Mar 0 1 
1003 19-Mar 1 1 
1003 2-Apr 1 1 
1003 9-Apr 1 0 
1003 27-Apr 1 1 
1003 8-May 0 1 
;

data events;
set have(rename=event1=event) have(rename=event2=event in=in2);
eventNo = 1 + in2;
drop event1 event2;
run;


data want;
tot_dur = 0;
do until(last.study_id);
    lastVisit = visitDate;
    set events; by eventNo study_id;
    if in then 
        if event then 
            if lastVisit > 0 then 
                tot_dur = sum(tot_dur, intck("day", lastVisit, visitDate));
            else;
        else in = 0;
    else
        if event then do;
            in = 1;
            count = sum(count, 1);
            end;
    end;
if count > 0 then ave_dur = tot_dur / count;
keep eventNo study_id tot_dur count ave_dur;
run;
            
proc print noobs; var eventNo study_id tot_dur count ave_dur; run;

PG

View solution in original post

PGStats · Posted 11-20-2017 11:03 PM

Simpler to do this with a long data organisation which can accomodate any number of events:

data have;
input study_id vd$ event1 event2;
visitDate = input(cats(vd,"-2017"), anydtdte11.);
format visitDate yymmdd10.;
drop vd;
datalines; 
1001 1-Jan 0 1 
1001 17-Jan 0 1 
1001 12-Feb 1 0 
1001 26-Feb 1 0 
1001 4-Mar 1 0 
1001 18-Mar 0 0 
1002 1-Jan 0 0 
1002 19-Jan 1 0 
1002 30-Jan 1 0 
1002 7-Feb 0 1 
1002 19-Feb 0 1 
1002 4-Mar 0 1 
1003 1-Jan 1 0 
1003 22-Jan 1 0 
1003 10-Feb 1 0 
1003 22-Feb 0 0 
1003 3-Mar 0 1 
1003 19-Mar 1 1 
1003 2-Apr 1 1 
1003 9-Apr 1 0 
1003 27-Apr 1 1 
1003 8-May 0 1 
;

data events;
set have(rename=event1=event) have(rename=event2=event in=in2);
eventNo = 1 + in2;
drop event1 event2;
run;


data want;
tot_dur = 0;
do until(last.study_id);
    lastVisit = visitDate;
    set events; by eventNo study_id;
    if in then 
        if event then 
            if lastVisit > 0 then 
                tot_dur = sum(tot_dur, intck("day", lastVisit, visitDate));
            else;
        else in = 0;
    else
        if event then do;
            in = 1;
            count = sum(count, 1);
            end;
    end;
if count > 0 then ave_dur = tot_dur / count;
keep eventNo study_id tot_dur count ave_dur;
run;
            
proc print noobs; var eventNo study_id tot_dur count ave_dur; run;

PG

jpsmith · Posted 11-20-2017 11:12 PM

Incredible! Thank you so much for your assistance. You have no idea how much this has helped.

Ksharp · Posted 11-21-2017 08:29 AM

data have;
input study_id vd$ event1 event2;
visitDate = input(cats(vd,"-2017"), anydtdte11.);
format visitDate yymmdd10.;
drop vd;
datalines; 
1001 1-Jan 0 1 
1001 17-Jan 0 1 
1001 12-Feb 1 0 
1001 26-Feb 1 0 
1001 4-Mar 1 0 
1001 18-Mar 0 0 
1002 1-Jan 0 0 
1002 19-Jan 1 0 
1002 30-Jan 1 0 
1002 7-Feb 0 1 
1002 19-Feb 0 1 
1002 4-Mar 0 1 
1003 1-Jan 1 0 
1003 22-Jan 1 0 
1003 10-Feb 1 0 
1003 22-Feb 0 0 
1003 3-Mar 0 1 
1003 19-Mar 1 1 
1003 2-Apr 1 1 
1003 9-Apr 1 0 
1003 27-Apr 1 1 
1003 8-May 0 1 
;

data temp;
 set have;
 by study_id  event1 notsorted;
 group+first.event1;
 if event1=1;
run;
proc sql;
create table want1 as
 select  study_id,count(distinct group) as e1_count,sum(range) as e1_tot_dur,
  calculated e1_tot_dur/calculated e1_count as e1_ave_dur
  from (select study_id,group,range(visitDate) as range from temp group by study_id,group)
   group by study_id;
quit;

Count number of events in longitudinal data

Re: Count number of events in longitudinal data

Re: Count number of events in longitudinal data

Re: Count number of events in longitudinal data

Re: Count number of events in longitudinal data

SAS Innovate 2025: Register Now

SAS Training: Just a Click Away