BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
yiyizhao86
Calcite | Level 5

Hi! I have the following data:

 

Original Data:

Type      Start_Date          End_Date

A          14Jul2002            23Jul2002

P          10Dec2002          09Dec2003

P          26Feb2003          09Dec2003

P          11Nov2003          10Jan2004

P          04May2004          24Aug2004

R          26Sep2007          19Feb2009

Final Result (combine the overlapping part and keep the non-over lapping part in different rows):

A          14Jul2002            23Jul2002

P          10Dec2002          10Jan2004

P          04May2004          24Aug2004

R          26Sep2007          19Feb2009

 

I have tried the following code:

data have;

     input type $ (start_date end_date) (:date9.);

     format start_date end_date date9.;

     cards;

A          14Jul2002            23Jul2002

P          10Dec2002          09Dec2003

P          26Feb2003          09Dec2003

P          11Nov2003          10Jan2004

P          04May2004          24Aug2004

R          26Sep2007          19Feb2009

R          26sep2007          01jan2010

;

proc sort data=have;

     by type start_date end_date;

run;

data want;

     do until (last.type);

           set have;

           by type;

           retain _start _end;

           format _start _end date9.;

           if first.type then

                do;

                     _start=start_date;

                     _end=end_date;

                end;

           if start_date > _end then

                do;

                     output;

                     _start=start_date;

                     _end=end_date;

                end;

           else  _end=max(_end, end_date);

     end;

     output;

     drop start_date end_date;

     rename _start=start_date _end=end_date;

run;

 

But the result I got was:

A 14Jul2002 23Jul2002

P 10Dec2002 24Aug2004

R 26Sep2007 01Jan2010

, which ignored the non-overlapping time period from 10Jan2004 to 04May2004

 

Could you please help me to fix the code? Thank you so much! I really appreciate it. 

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

I used a slightly different approach because I didn't see the benefit of including a DOW loop. I think your WANT was wrong for the type R record. I think the range should be 26sep2007 thru 01Jan2010:

data have;
     input type $ (start_date end_date) (:date9.);
     format start_date end_date date9.;
     cards;
A          14Jul2002          23Jul2002
P          10Dec2002          09Dec2003
P          26Feb2003          09Dec2003
P          11Nov2003          10Jan2004
P          04May2004          24Aug2004
R          26Sep2007          19Feb2009
R          26sep2007          01jan2010
;
proc sort data=have;
     by type start_date end_date;
run;
data want (keep=type _start _end rename=(_start=start_date _end=end_date));
  set have;
  by type;
  retain _start _end;
  format _start _end date9.;
  if first.type then do;
    _start=start_date;
    _end=end_date;
  end;
  if start_date > _end then do;
    output;
    _start=start_date;
    _end=end_date;
  end;
  else  _end=max(_end, end_date);
  if last.type then output;
run;

Art, CEO, AnalystFinder.com

View solution in original post

3 REPLIES 3
art297
Opal | Level 21

I used a slightly different approach because I didn't see the benefit of including a DOW loop. I think your WANT was wrong for the type R record. I think the range should be 26sep2007 thru 01Jan2010:

data have;
     input type $ (start_date end_date) (:date9.);
     format start_date end_date date9.;
     cards;
A          14Jul2002          23Jul2002
P          10Dec2002          09Dec2003
P          26Feb2003          09Dec2003
P          11Nov2003          10Jan2004
P          04May2004          24Aug2004
R          26Sep2007          19Feb2009
R          26sep2007          01jan2010
;
proc sort data=have;
     by type start_date end_date;
run;
data want (keep=type _start _end rename=(_start=start_date _end=end_date));
  set have;
  by type;
  retain _start _end;
  format _start _end date9.;
  if first.type then do;
    _start=start_date;
    _end=end_date;
  end;
  if start_date > _end then do;
    output;
    _start=start_date;
    _end=end_date;
  end;
  else  _end=max(_end, end_date);
  if last.type then output;
run;

Art, CEO, AnalystFinder.com

yiyizhao86
Calcite | Level 5
Yes, I agree. Thank you so much for helping me out with the code!
SAS_inquisitive
Lapis Lazuli | Level 10
Also, if you are using Dow loop, you don't have to retain as it never exists out of implied loop until last.var.

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 1115 views
  • 0 likes
  • 3 in conversation