BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ChristyN
Fluorite | Level 6

Hello, I have the following data:

Type      Start_Date          End_Date

A          14Jul2002            23Jul2002

P          10Dec2002          09Dec2003

P          26Feb2003          09Dec2003

P          04May2004          24Aug2004

R          26Sep2007          19Feb2009

Observations 2 and 3 are the same type and have overlapping date ranges. How can I identify them as 'related' based on type and overlapping date ranges?

Thanks in advance Smiley Happy

1 ACCEPTED SOLUTION

Accepted Solutions
Haikuo
Onyx | Level 15

Another alternative:

data have;

     input type $ (start_date end_date) (:date9.);

     format start_date end_date date9.;

     cards;

A          14Jul2002            23Jul2002

P          10Dec2002          09Dec2003

P          26Feb2003          09Dec2003

P          11Nov2003          10Jan2004

P          04May2004          24Aug2004

R          26Sep2007          19Feb2009

R          26sep2007          01jan2010

;

proc sort data=have;

     by type start_date end_date;

run;

data want;

     do until (last.type);

           set have;

           by type;

           retain _start _end;

           format _start _end date9.;

           if first.type then

                do;

                     _start=start_date;

                     _end=end_date;

                end;

           if start_date > _end then

                do;

                     output;

                     _start=start_date;

                     _end=end_date;

                end;

           else  _end=max(_end, end_date);

     end;

     output;

     drop start_date end_date;

     rename _start=start_date _end=end_date;

run;

View solution in original post

6 REPLIES 6
ballardw
Super User

What do you expect the output data to look like?

The most likely tool involves either LAG or DIF functions in a datastep.

What would the output look like if there were 3 or more sequential over laps such as

P  11Nov2003  10Jan2004

Is your data sorted by type and start_date?

ChristyN
Fluorite | Level 6

At the end of the day, I would want 'overlapping' observations to appear as just one using the earliest start date and the latest end date.

If I add in your additional observation:

Original Data:

Type      Start_Date          End_Date

A          14Jul2002            23Jul2002

P          10Dec2002          09Dec2003

P          26Feb2003          09Dec2003

P          11Nov2003          10Jan2004

P          04May2004          24Aug2004

R          26Sep2007          19Feb2009

Final Result:

A          14Jul2002            23Jul2002

P          10Dec2002          10Jan2004

P          04May2004          24Aug2004

R          26Sep2007          19Feb2009

And yes it is sorted by type and start_date.

Thank-you.

art297
Opal | Level 21

Christy: I think that the following does what you want to accomplish:

data have;

input type $ (start_date end_date) (:date9.);

format start_date end_date date9.;

cards;

A          14Jul2002            23Jul2002

P          10Dec2002          09Dec2003

P          26Feb2003          09Dec2003

P          04May2004          24Aug2004

R          26Sep2007          19Feb2009

;

data want;

   set have (rename=(start_date=s_dt end_date=e_dt));

     by type;

  retain start_date end_date;

  format start_date end_date date9.;

      if first.type then do;

        start_date=s_dt;

        end_date=e_dt;

      end;

      if s_dt <= lag(e_dt) then end_date=e_dt;

   else if not first.type then do;

     output;

     start_date=s_dt;

     end_date=e_dt;

   end;

   if last.type then output;

   drop s_dt e_dt;

run;

Haikuo
Onyx | Level 15

Another alternative:

data have;

     input type $ (start_date end_date) (:date9.);

     format start_date end_date date9.;

     cards;

A          14Jul2002            23Jul2002

P          10Dec2002          09Dec2003

P          26Feb2003          09Dec2003

P          11Nov2003          10Jan2004

P          04May2004          24Aug2004

R          26Sep2007          19Feb2009

R          26sep2007          01jan2010

;

proc sort data=have;

     by type start_date end_date;

run;

data want;

     do until (last.type);

           set have;

           by type;

           retain _start _end;

           format _start _end date9.;

           if first.type then

                do;

                     _start=start_date;

                     _end=end_date;

                end;

           if start_date > _end then

                do;

                     output;

                     _start=start_date;

                     _end=end_date;

                end;

           else  _end=max(_end, end_date);

     end;

     output;

     drop start_date end_date;

     rename _start=start_date _end=end_date;

run;

ChristyN
Fluorite | Level 6

Thank you very much for the responses! I was out of the office yesterday but will be trying out your suggestions today. I really appreciate the input.

Cheers!

yiyizhao86
Calcite | Level 5

Thank you so much for the code! But then I was testing out the code, the result I got was:

 

A 14Jul2002 23Jul2002

P 10Dec2002 24Aug2004

R 26Sep2007 01Jan2010

 

but the actual result that I want is 

 

A 14Jul2002 23Jul2002

P 10Dec2002  01Jan2004

P 04May2004 24Aug2004

R 26Sep2007 01Jan2010

 

 

Since there is no overlap between 01Jan 2004 and 04May 2004, I would like to show this gap in the final result. Will you mind letting me know how to fix this in the code? Thank you so much!

 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 7108 views
  • 0 likes
  • 5 in conversation