Hello, I have the following data:
Type Start_Date End_Date
A 14Jul2002 23Jul2002
P 10Dec2002 09Dec2003
P 26Feb2003 09Dec2003
P 04May2004 24Aug2004
R 26Sep2007 19Feb2009
Observations 2 and 3 are the same type and have overlapping date ranges. How can I identify them as 'related' based on type and overlapping date ranges?
Thanks in advance
Another alternative:
data have;
input type $ (start_date end_date) (:date9.);
format start_date end_date date9.;
cards;
A 14Jul2002 23Jul2002
P 10Dec2002 09Dec2003
P 26Feb2003 09Dec2003
P 11Nov2003 10Jan2004
P 04May2004 24Aug2004
R 26Sep2007 19Feb2009
R 26sep2007 01jan2010
;
proc sort data=have;
by type start_date end_date;
run;
data want;
do until (last.type);
set have;
by type;
retain _start _end;
format _start _end date9.;
if first.type then
do;
_start=start_date;
_end=end_date;
end;
if start_date > _end then
do;
output;
_start=start_date;
_end=end_date;
end;
else _end=max(_end, end_date);
end;
output;
drop start_date end_date;
rename _start=start_date _end=end_date;
run;
What do you expect the output data to look like?
The most likely tool involves either LAG or DIF functions in a datastep.
What would the output look like if there were 3 or more sequential over laps such as
P 11Nov2003 10Jan2004
Is your data sorted by type and start_date?
At the end of the day, I would want 'overlapping' observations to appear as just one using the earliest start date and the latest end date.
If I add in your additional observation:
Original Data:
Type Start_Date End_Date
A 14Jul2002 23Jul2002
P 10Dec2002 09Dec2003
P 26Feb2003 09Dec2003
P 11Nov2003 10Jan2004
P 04May2004 24Aug2004
R 26Sep2007 19Feb2009
Final Result:
A 14Jul2002 23Jul2002
P 10Dec2002 10Jan2004
P 04May2004 24Aug2004
R 26Sep2007 19Feb2009
And yes it is sorted by type and start_date.
Thank-you.
Christy: I think that the following does what you want to accomplish:
data have;
input type $ (start_date end_date) (:date9.);
format start_date end_date date9.;
cards;
A 14Jul2002 23Jul2002
P 10Dec2002 09Dec2003
P 26Feb2003 09Dec2003
P 04May2004 24Aug2004
R 26Sep2007 19Feb2009
;
data want;
set have (rename=(start_date=s_dt end_date=e_dt));
by type;
retain start_date end_date;
format start_date end_date date9.;
if first.type then do;
start_date=s_dt;
end_date=e_dt;
end;
if s_dt <= lag(e_dt) then end_date=e_dt;
else if not first.type then do;
output;
start_date=s_dt;
end_date=e_dt;
end;
if last.type then output;
drop s_dt e_dt;
run;
Another alternative:
data have;
input type $ (start_date end_date) (:date9.);
format start_date end_date date9.;
cards;
A 14Jul2002 23Jul2002
P 10Dec2002 09Dec2003
P 26Feb2003 09Dec2003
P 11Nov2003 10Jan2004
P 04May2004 24Aug2004
R 26Sep2007 19Feb2009
R 26sep2007 01jan2010
;
proc sort data=have;
by type start_date end_date;
run;
data want;
do until (last.type);
set have;
by type;
retain _start _end;
format _start _end date9.;
if first.type then
do;
_start=start_date;
_end=end_date;
end;
if start_date > _end then
do;
output;
_start=start_date;
_end=end_date;
end;
else _end=max(_end, end_date);
end;
output;
drop start_date end_date;
rename _start=start_date _end=end_date;
run;
Thank you very much for the responses! I was out of the office yesterday but will be trying out your suggestions today. I really appreciate the input.
Cheers!
Thank you so much for the code! But then I was testing out the code, the result I got was:
A 14Jul2002 23Jul2002
P 10Dec2002 24Aug2004
R 26Sep2007 01Jan2010
but the actual result that I want is
A 14Jul2002 23Jul2002
P 10Dec2002 01Jan2004
P 04May2004 24Aug2004
R 26Sep2007 01Jan2010
Since there is no overlap between 01Jan 2004 and 04May 2004, I would like to show this gap in the final result. Will you mind letting me know how to fix this in the code? Thank you so much!
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.