Hello Community Members,
I have an event episode data file for children involved in child welfare. The variables include CHILDID, event TYPE (3 categories: investigation INV, service, SRV, out of home placement PLC), event START date, event STOP date, and the event sequence number (EVENT_NUM). The events are sorted based on start and stop dates for each child. However, there are considerable overlapping of the events event for the same event type. I would like to collapse the overlapping or connecting events of the same type into one event with the earliest start and latest stop dates of all related events. I have figured out how to do it, but the SAS codes seem too complex. I have included the SAS codes together with some sample events (for two children). Please let me know if you find any problem or you think there are more efficient ways to do it. Thanks,
** Identify overlapping events and collapse overlapping and/or connected events of same type;
data one;
input childid type $ (start stop) (: mmddyy12.) event_num;
format start stop mmddyy10.;
cards;
1 inv 03/01/2015 03/10/2015 1
1 srv 03/10/2015 05/01/2015 2
1 inv 06/01/2015 06/30/2015 3
1 srv 06/20/2015 06/30/2015 4
1 srv 01/01/2016 03/01/2016 5
1 srv 02/01/2016 05/01/2016 6
1 plc 06/01/2016 10/04/2016 7
1 inv 10/01/2016 10/10/2016 8
1 inv 10/11/2016 10/20/2016 9
1 plc 10/21/2016 05/01/2017 10
2 inv 12/01/2015 12/20/2015 1
2 inv 12/21/2015 12/30/2015 2
2 srv 12/25/2015 06/01/2016 3
2 srv 03/01/2016 05/01/2016 4
2 srv 06/02/2016 10/05/2016 5
2 srv 12/01/2016 12/30/2016 6
2 plc 12/30/2016 04/04/2017 7
;
run;
** Sort by childid, start and stop dates;
proc sort data=one;
by childid start stop type;
run;
** Purpose: collapse same type events that overlap (id1, num 5,6), connect (id 1 num 8,9 / id2 , num 1,2), or;
** envelop & connect each other (id 2,, num 3,4,5) as One single event, using earliest start date and latest end date;
** Like the following, with events of overlapping dates collpased;
/*
1 inv 03/01/2015 03/10/2015 1
1 srv 03/10/2015 05/01/2015 2
1 inv 06/01/2015 06/30/2015 3
1 srv 06/20/2015 06/30/2015 4
1 srv 01/01/2016 05/01/2016 5+6
1 plc 06/01/2016 10/04/2016 7
1 inv 10/01/2016 10/20/2016 8+9
1 plc 10/21/2016 05/01/2017 10
2 inv 12/01/2015 12/30/2015 1+2
2 srv 12/25/2015 10/05/2016 3+4+5
2 srv 12/01/2016 12/30/2016 6
2 plc 12/30/2016 04/04/2017 7
*/
data two;
set one;
by childid start stop type;
retain istart istop itype counter;
newstart=start;
newstop=stop;
if first.childid then do;
istart=start;
istop=stop;
itype=type;
counter=0;
end;
counter=counter+1;
if not first.childid then do;
if type ne itype then do;
itype=type;
istart=start;
istop=stop;
end;
else if type eq itype then do;
if start gt (istop+1) then do;
* check=0;
istart=start;
istop=stop;
end;
else if start le (istop+1) then do;
check=1;
newstart=istart;
if stop le istop then newstop=istop; /*for nested event*/
if stop gt istop then istop=stop;
end;
end;
end;
format istart istop newstart newstop mmddyy10.;
run;
proc sort data=two;
by childid itype newstart descending newstop;
run;
proc sort data=two out=three nodupkey;
by childid itype newstart; * descending newstop;
run;
proc sort data=three;
by childid counter;
run;
... View more