BookmarkSubscribeRSS Feed
tomhead19
Calcite | Level 5

Hello,

 

I am trying to do what is described in the title. For example,  I am trying to determine how to make a continuous date and then delete the 2 duplicated to create 1 observation from the three. I have 3 observations, all sharing the same main variable (ID#). The other two variables, start date and stop date, should be continues if the time between those two dates is <2. So, like (fake dates) 02/28-03/11, 03/12-03/30. How can I code my programming to make it appear as 02/28-03/30 then delete the two duplicates. 

3 REPLIES 3
Tom
Super User Tom
Super User

What you are asking is how to collapse a time interval.  You want to collapse ranges where the GAP between them is smaller 3 days.

 

This is a common question on SAS communities.   For example see this recent thread.

 

https://communities.sas.com/t5/SAS-Programming/How-to-collapse-time-intervals-contingent-upon-some-a...

 

They did not allow a gap.  So just adjust the logic that is comparing the follow-up start dates to the previous end dates to allow your gap size.

 

Kurt_Bremser
Super User
data want;
set have;
set
  have (
    firstobs=2
    keep=id start_dt
    rename=(id=_id start_dt=_start)
  )
  have (
    obs=1
    keep=id start_dt
    rename=(id=-id start_dt=_start)
  )
;
retain _begin;
if _n_ = 1 then _begin = start_dt;
if id ne _id or _start gt stop_dt + 1
then do;
  start_dt = _begin;
  output;
  _begin = _start;
end;
drop _:;
run;

Untested, posted from my tablet.

mkeintz
PROC Star

So you have to (1) look back to see if the current obs is a new id, or follows a gap that is too large, and (2) look forward to see if the current obs is the last for a given id, or precedes a gap that is too large.

 

Untested, in the absence of sample data in the form of a working DATA step: 

%let max_gap=2;  /* Maximum allowable "hole" size */

data want (drop=_:);
  merge have
        have (firstobs=2 keep=id start_date rename=(id=_nxt_id start_date=_nxt_start));

  retain _initial_start;
  if id^=lag(id) or start-&max_gap > lag(stop_date) then _initial_start=start_date;

  if (id^=_nxt_id)  or  (_nxt_start > stop_date+&max_gap) ; /*Subsetting IF*/
  start_date=_initial_start;
run;

This code assumes that the data are grouped by ID, and are sorted chronologically within ID.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 376 views
  • 0 likes
  • 4 in conversation