BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
yiyizhao86
Calcite | Level 5

Hi! I have the following data:

 

Original Data:

Type      Start_Date          End_Date

A          14Jul2002            23Jul2002

P          10Dec2002          09Dec2003

P          26Feb2003          09Dec2003

P          11Nov2003          10Jan2004

P          04May2004          24Aug2004

R          26Sep2007          19Feb2009

Final Result (combine the overlapping part and keep the non-over lapping part in different rows):

A          14Jul2002            23Jul2002

P          10Dec2002          10Jan2004

P          04May2004          24Aug2004

R          26Sep2007          19Feb2009

 

I have tried the following code:

data have;

     input type $ (start_date end_date) (:date9.);

     format start_date end_date date9.;

     cards;

A          14Jul2002            23Jul2002

P          10Dec2002          09Dec2003

P          26Feb2003          09Dec2003

P          11Nov2003          10Jan2004

P          04May2004          24Aug2004

R          26Sep2007          19Feb2009

R          26sep2007          01jan2010

;

proc sort data=have;

     by type start_date end_date;

run;

data want;

     do until (last.type);

           set have;

           by type;

           retain _start _end;

           format _start _end date9.;

           if first.type then

                do;

                     _start=start_date;

                     _end=end_date;

                end;

           if start_date > _end then

                do;

                     output;

                     _start=start_date;

                     _end=end_date;

                end;

           else  _end=max(_end, end_date);

     end;

     output;

     drop start_date end_date;

     rename _start=start_date _end=end_date;

run;

 

But the result I got was:

A 14Jul2002 23Jul2002

P 10Dec2002 24Aug2004

R 26Sep2007 01Jan2010

, which ignored the non-overlapping time period from 10Jan2004 to 04May2004

 

Could you please help me to fix the code? Thank you so much! I really appreciate it. 

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

I used a slightly different approach because I didn't see the benefit of including a DOW loop. I think your WANT was wrong for the type R record. I think the range should be 26sep2007 thru 01Jan2010:

data have;
     input type $ (start_date end_date) (:date9.);
     format start_date end_date date9.;
     cards;
A          14Jul2002          23Jul2002
P          10Dec2002          09Dec2003
P          26Feb2003          09Dec2003
P          11Nov2003          10Jan2004
P          04May2004          24Aug2004
R          26Sep2007          19Feb2009
R          26sep2007          01jan2010
;
proc sort data=have;
     by type start_date end_date;
run;
data want (keep=type _start _end rename=(_start=start_date _end=end_date));
  set have;
  by type;
  retain _start _end;
  format _start _end date9.;
  if first.type then do;
    _start=start_date;
    _end=end_date;
  end;
  if start_date > _end then do;
    output;
    _start=start_date;
    _end=end_date;
  end;
  else  _end=max(_end, end_date);
  if last.type then output;
run;

Art, CEO, AnalystFinder.com

View solution in original post

3 REPLIES 3
art297
Opal | Level 21

I used a slightly different approach because I didn't see the benefit of including a DOW loop. I think your WANT was wrong for the type R record. I think the range should be 26sep2007 thru 01Jan2010:

data have;
     input type $ (start_date end_date) (:date9.);
     format start_date end_date date9.;
     cards;
A          14Jul2002          23Jul2002
P          10Dec2002          09Dec2003
P          26Feb2003          09Dec2003
P          11Nov2003          10Jan2004
P          04May2004          24Aug2004
R          26Sep2007          19Feb2009
R          26sep2007          01jan2010
;
proc sort data=have;
     by type start_date end_date;
run;
data want (keep=type _start _end rename=(_start=start_date _end=end_date));
  set have;
  by type;
  retain _start _end;
  format _start _end date9.;
  if first.type then do;
    _start=start_date;
    _end=end_date;
  end;
  if start_date > _end then do;
    output;
    _start=start_date;
    _end=end_date;
  end;
  else  _end=max(_end, end_date);
  if last.type then output;
run;

Art, CEO, AnalystFinder.com

yiyizhao86
Calcite | Level 5
Yes, I agree. Thank you so much for helping me out with the code!
SAS_inquisitive
Lapis Lazuli | Level 10
Also, if you are using Dow loop, you don't have to retain as it never exists out of implied loop until last.var.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 588 views
  • 0 likes
  • 3 in conversation