BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
singhsahab
Lapis Lazuli | Level 10

Hi All,

 

request you to please help me to solve below problem.  

 

data have;
input SUBJECT START_DATE $9. END_DATE $10. seq;
cards;
3008 01OCT2020 01OCT2020 1
3008 02OCT2020 04OCT2020 2
3008 05OCT2020 06OCT2020 3
3008 08OCT2020 05OCT2020 4
3009 03OCT2020 03OCT2020 1
3009 04OCT2020 05OCT2020 2
3009 05OCT2020 05OCT2020 3 
3009 06OCT2020 07OCT2020 4
;
run;

May requirement is:

 

1. If First event has started and ended, after ending first event the second event must be start next to first ending date then min START_DATE and max END_DATE need to be consider in final output as First dataset.

2. If first event end date and second event start date has more then one day difference then it must be flagged. in second dataset.

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User
Please show what you expect as output, if this is the given input.

View solution in original post

3 REPLIES 3
Reeza
Super User
Please show what you expect as output, if this is the given input.
singhsahab
Lapis Lazuli | Level 10

@Reeza : Sorry by mistake i clicked on solution button.

 

here is my input and output requirement. 

data have;
input SUBJECT START_DATE $9. END_DATE $10. seq;
cards;
3008 01OCT2020 01OCT2020 1
3008 02OCT2020 04OCT2020 2
3008 05OCT2020 06OCT2020 3
3008 08OCT2020 10OCT2020 4
3009 03OCT2020 03OCT2020 1
3009 04OCT2020 05OCT2020 2
3009 05OCT2020 05OCT2020 3 
3009 06OCT2020 07OCT2020 4
3010 13OCT2020 15OCT2020 1
3010 17OCT2020 18OCT2020 2
;
run;


Output:

First Dataset:

SUBJECT START_DATE END_DATE
3008    01OCT2020  06OCT2020 
3009	03OCT2020  07OCT2020
3010	13OCT2020  15OCT2020 

Second Dataset:

SUBJECT START_DATE END_DATE
3008	 08OCT2020 10OCT2020 
3010	 17OCT2020 18OCT2020 

 Thank you 

Tom
Super User Tom
Super User

Sounds like you just want to collapse overlapping (or close to overlapping) periods into one period.  Not sure why you want two datasets out when one dataset can contain all of the periods with just a separate variable to indicate which period it is.

So first make sure your dates are actual date variables (and not strings like you example).

data have;
  input SUBJECT (START_DATE END_DATE) (:date.) seq;
  format start_date end_date date9.;
cards;
3008 01OCT2020 01OCT2020 1
3008 02OCT2020 04OCT2020 2
3008 05OCT2020 06OCT2020 3
3008 08OCT2020 10OCT2020 4
3009 03OCT2020 03OCT2020 1
3009 04OCT2020 05OCT2020 2
3009 05OCT2020 05OCT2020 3
3009 06OCT2020 07OCT2020 4
3010 13OCT2020 15OCT2020 1
3010 17OCT2020 18OCT2020 2
;

Then generate a PERIOD variable to flag the overlapping periods.  Adjust the IF condition if you want to allow different size gap between start and end to be considered part of the same large period.

Including START_DATE and END_DATE in the BY statement will cause the data step to insure the observations are sorted properly.  

data periods ;
  set have ;
  by subject start_date end_date ;
  lag_end = lag(end_date);
  if first.subject then period=1;
  else if start_date > lag_end+1 then period+1;
  drop lag_end;
run;

Now collapse to one observation per derived period.

data want;
  set periods ;
  by subject period;
  if first.period then start=start_date ;
  if last.period then do;
    start_date=start;
    output;
  end;
  retain start;
  drop start seq;
run;

Results:

                     START_
Obs    SUBJECT         DATE     END_DATE    period

 1       3008     01OCT2020    06OCT2020       1
 2       3008     08OCT2020    10OCT2020       2
 3       3009     03OCT2020    07OCT2020       1
 4       3010     13OCT2020    15OCT2020       1
 5       3010     17OCT2020    18OCT2020       2

 

Note if the original start/end date data is messier with overlapping or nested time periods you will need more complex logic to determine the overlapping periods.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 781 views
  • 1 like
  • 3 in conversation