BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
yiyizhao86
Calcite | Level 5

Hi! I have the following data:

 

Original Data:

Type      Start_Date          End_Date

A          14Jul2002            23Jul2002

P          10Dec2002          09Dec2003

P          26Feb2003          09Dec2003

P          11Nov2003          10Jan2004

P          04May2004          24Aug2004

R          26Sep2007          19Feb2009

Final Result (combine the overlapping part and keep the non-over lapping part in different rows):

A          14Jul2002            23Jul2002

P          10Dec2002          10Jan2004

P          04May2004          24Aug2004

R          26Sep2007          19Feb2009

 

I have tried the following code:

data have;

     input type $ (start_date end_date) (:date9.);

     format start_date end_date date9.;

     cards;

A          14Jul2002            23Jul2002

P          10Dec2002          09Dec2003

P          26Feb2003          09Dec2003

P          11Nov2003          10Jan2004

P          04May2004          24Aug2004

R          26Sep2007          19Feb2009

R          26sep2007          01jan2010

;

proc sort data=have;

     by type start_date end_date;

run;

data want;

     do until (last.type);

           set have;

           by type;

           retain _start _end;

           format _start _end date9.;

           if first.type then

                do;

                     _start=start_date;

                     _end=end_date;

                end;

           if start_date > _end then

                do;

                     output;

                     _start=start_date;

                     _end=end_date;

                end;

           else  _end=max(_end, end_date);

     end;

     output;

     drop start_date end_date;

     rename _start=start_date _end=end_date;

run;

 

But the result I got was:

A 14Jul2002 23Jul2002

P 10Dec2002 24Aug2004

R 26Sep2007 01Jan2010

, which ignored the non-overlapping time period from 10Jan2004 to 04May2004

 

Could you please help me to fix the code? Thank you so much! I really appreciate it. 

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

I used a slightly different approach because I didn't see the benefit of including a DOW loop. I think your WANT was wrong for the type R record. I think the range should be 26sep2007 thru 01Jan2010:

data have;
     input type $ (start_date end_date) (:date9.);
     format start_date end_date date9.;
     cards;
A          14Jul2002          23Jul2002
P          10Dec2002          09Dec2003
P          26Feb2003          09Dec2003
P          11Nov2003          10Jan2004
P          04May2004          24Aug2004
R          26Sep2007          19Feb2009
R          26sep2007          01jan2010
;
proc sort data=have;
     by type start_date end_date;
run;
data want (keep=type _start _end rename=(_start=start_date _end=end_date));
  set have;
  by type;
  retain _start _end;
  format _start _end date9.;
  if first.type then do;
    _start=start_date;
    _end=end_date;
  end;
  if start_date > _end then do;
    output;
    _start=start_date;
    _end=end_date;
  end;
  else  _end=max(_end, end_date);
  if last.type then output;
run;

Art, CEO, AnalystFinder.com

View solution in original post

3 REPLIES 3
art297
Opal | Level 21

I used a slightly different approach because I didn't see the benefit of including a DOW loop. I think your WANT was wrong for the type R record. I think the range should be 26sep2007 thru 01Jan2010:

data have;
     input type $ (start_date end_date) (:date9.);
     format start_date end_date date9.;
     cards;
A          14Jul2002          23Jul2002
P          10Dec2002          09Dec2003
P          26Feb2003          09Dec2003
P          11Nov2003          10Jan2004
P          04May2004          24Aug2004
R          26Sep2007          19Feb2009
R          26sep2007          01jan2010
;
proc sort data=have;
     by type start_date end_date;
run;
data want (keep=type _start _end rename=(_start=start_date _end=end_date));
  set have;
  by type;
  retain _start _end;
  format _start _end date9.;
  if first.type then do;
    _start=start_date;
    _end=end_date;
  end;
  if start_date > _end then do;
    output;
    _start=start_date;
    _end=end_date;
  end;
  else  _end=max(_end, end_date);
  if last.type then output;
run;

Art, CEO, AnalystFinder.com

yiyizhao86
Calcite | Level 5
Yes, I agree. Thank you so much for helping me out with the code!
SAS_inquisitive
Lapis Lazuli | Level 10
Also, if you are using Dow loop, you don't have to retain as it never exists out of implied loop until last.var.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 1138 views
  • 0 likes
  • 3 in conversation