Hi! I have the following data:
Original Data:
Type Start_Date End_Date
A 14Jul2002 23Jul2002
P 10Dec2002 09Dec2003
P 26Feb2003 09Dec2003
P 11Nov2003 10Jan2004
P 04May2004 24Aug2004
R 26Sep2007 19Feb2009
Final Result (combine the overlapping part and keep the non-over lapping part in different rows):
A 14Jul2002 23Jul2002
P 10Dec2002 10Jan2004
P 04May2004 24Aug2004
R 26Sep2007 19Feb2009
I have tried the following code:
data have;
input type $ (start_date end_date) (:date9.);
format start_date end_date date9.;
cards;
A 14Jul2002 23Jul2002
P 10Dec2002 09Dec2003
P 26Feb2003 09Dec2003
P 11Nov2003 10Jan2004
P 04May2004 24Aug2004
R 26Sep2007 19Feb2009
R 26sep2007 01jan2010
;
proc sort data=have;
by type start_date end_date;
run;
data want;
do until (last.type);
set have;
by type;
retain _start _end;
format _start _end date9.;
if first.type then
do;
_start=start_date;
_end=end_date;
end;
if start_date > _end then
do;
output;
_start=start_date;
_end=end_date;
end;
else _end=max(_end, end_date);
end;
output;
drop start_date end_date;
rename _start=start_date _end=end_date;
run;
But the result I got was:
A 14Jul2002 23Jul2002
P 10Dec2002 24Aug2004
R 26Sep2007 01Jan2010
, which ignored the non-overlapping time period from 10Jan2004 to 04May2004.
Could you please help me to fix the code? Thank you so much! I really appreciate it.
I used a slightly different approach because I didn't see the benefit of including a DOW loop. I think your WANT was wrong for the type R record. I think the range should be 26sep2007 thru 01Jan2010:
data have; input type $ (start_date end_date) (:date9.); format start_date end_date date9.; cards; A 14Jul2002 23Jul2002 P 10Dec2002 09Dec2003 P 26Feb2003 09Dec2003 P 11Nov2003 10Jan2004 P 04May2004 24Aug2004 R 26Sep2007 19Feb2009 R 26sep2007 01jan2010 ; proc sort data=have; by type start_date end_date; run; data want (keep=type _start _end rename=(_start=start_date _end=end_date)); set have; by type; retain _start _end; format _start _end date9.; if first.type then do; _start=start_date; _end=end_date; end; if start_date > _end then do; output; _start=start_date; _end=end_date; end; else _end=max(_end, end_date); if last.type then output; run;
Art, CEO, AnalystFinder.com
I used a slightly different approach because I didn't see the benefit of including a DOW loop. I think your WANT was wrong for the type R record. I think the range should be 26sep2007 thru 01Jan2010:
data have; input type $ (start_date end_date) (:date9.); format start_date end_date date9.; cards; A 14Jul2002 23Jul2002 P 10Dec2002 09Dec2003 P 26Feb2003 09Dec2003 P 11Nov2003 10Jan2004 P 04May2004 24Aug2004 R 26Sep2007 19Feb2009 R 26sep2007 01jan2010 ; proc sort data=have; by type start_date end_date; run; data want (keep=type _start _end rename=(_start=start_date _end=end_date)); set have; by type; retain _start _end; format _start _end date9.; if first.type then do; _start=start_date; _end=end_date; end; if start_date > _end then do; output; _start=start_date; _end=end_date; end; else _end=max(_end, end_date); if last.type then output; run;
Art, CEO, AnalystFinder.com
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.