Hi! I have the following data:
Original Data:
Type Start_Date End_Date
A 14Jul2002 23Jul2002
P 10Dec2002 09Dec2003
P 26Feb2003 09Dec2003
P 11Nov2003 10Jan2004
P 04May2004 24Aug2004
R 26Sep2007 19Feb2009
Final Result (combine the overlapping part and keep the non-over lapping part in different rows):
A 14Jul2002 23Jul2002
P 10Dec2002 10Jan2004
P 04May2004 24Aug2004
R 26Sep2007 19Feb2009
I have tried the following code:
data have;
input type $ (start_date end_date) (:date9.);
format start_date end_date date9.;
cards;
A 14Jul2002 23Jul2002
P 10Dec2002 09Dec2003
P 26Feb2003 09Dec2003
P 11Nov2003 10Jan2004
P 04May2004 24Aug2004
R 26Sep2007 19Feb2009
R 26sep2007 01jan2010
;
proc sort data=have;
by type start_date end_date;
run;
data want;
do until (last.type);
set have;
by type;
retain _start _end;
format _start _end date9.;
if first.type then
do;
_start=start_date;
_end=end_date;
end;
if start_date > _end then
do;
output;
_start=start_date;
_end=end_date;
end;
else _end=max(_end, end_date);
end;
output;
drop start_date end_date;
rename _start=start_date _end=end_date;
run;
But the result I got was:
A 14Jul2002 23Jul2002
P 10Dec2002 24Aug2004
R 26Sep2007 01Jan2010
, which ignored the non-overlapping time period from 10Jan2004 to 04May2004.
Could you please help me to fix the code? Thank you so much! I really appreciate it.
I used a slightly different approach because I didn't see the benefit of including a DOW loop. I think your WANT was wrong for the type R record. I think the range should be 26sep2007 thru 01Jan2010:
data have; input type $ (start_date end_date) (:date9.); format start_date end_date date9.; cards; A 14Jul2002 23Jul2002 P 10Dec2002 09Dec2003 P 26Feb2003 09Dec2003 P 11Nov2003 10Jan2004 P 04May2004 24Aug2004 R 26Sep2007 19Feb2009 R 26sep2007 01jan2010 ; proc sort data=have; by type start_date end_date; run; data want (keep=type _start _end rename=(_start=start_date _end=end_date)); set have; by type; retain _start _end; format _start _end date9.; if first.type then do; _start=start_date; _end=end_date; end; if start_date > _end then do; output; _start=start_date; _end=end_date; end; else _end=max(_end, end_date); if last.type then output; run;
Art, CEO, AnalystFinder.com
I used a slightly different approach because I didn't see the benefit of including a DOW loop. I think your WANT was wrong for the type R record. I think the range should be 26sep2007 thru 01Jan2010:
data have; input type $ (start_date end_date) (:date9.); format start_date end_date date9.; cards; A 14Jul2002 23Jul2002 P 10Dec2002 09Dec2003 P 26Feb2003 09Dec2003 P 11Nov2003 10Jan2004 P 04May2004 24Aug2004 R 26Sep2007 19Feb2009 R 26sep2007 01jan2010 ; proc sort data=have; by type start_date end_date; run; data want (keep=type _start _end rename=(_start=start_date _end=end_date)); set have; by type; retain _start _end; format _start _end date9.; if first.type then do; _start=start_date; _end=end_date; end; if start_date > _end then do; output; _start=start_date; _end=end_date; end; else _end=max(_end, end_date); if last.type then output; run;
Art, CEO, AnalystFinder.com
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.