BookmarkSubscribeRSS Feed
tomhead19
Calcite | Level 5

Hello,

 

I am trying to do what is described in the title. For example,  I am trying to determine how to make a continuous date and then delete the 2 duplicated to create 1 observation from the three. I have 3 observations, all sharing the same main variable (ID#). The other two variables, start date and stop date, should be continues if the time between those two dates is <2. So, like (fake dates) 02/28-03/11, 03/12-03/30. How can I code my programming to make it appear as 02/28-03/30 then delete the two duplicates. 

3 REPLIES 3
Tom
Super User Tom
Super User

What you are asking is how to collapse a time interval.  You want to collapse ranges where the GAP between them is smaller 3 days.

 

This is a common question on SAS communities.   For example see this recent thread.

 

https://communities.sas.com/t5/SAS-Programming/How-to-collapse-time-intervals-contingent-upon-some-a...

 

They did not allow a gap.  So just adjust the logic that is comparing the follow-up start dates to the previous end dates to allow your gap size.

 

Kurt_Bremser
Super User
data want;
set have;
set
  have (
    firstobs=2
    keep=id start_dt
    rename=(id=_id start_dt=_start)
  )
  have (
    obs=1
    keep=id start_dt
    rename=(id=-id start_dt=_start)
  )
;
retain _begin;
if _n_ = 1 then _begin = start_dt;
if id ne _id or _start gt stop_dt + 1
then do;
  start_dt = _begin;
  output;
  _begin = _start;
end;
drop _:;
run;

Untested, posted from my tablet.

mkeintz
PROC Star

So you have to (1) look back to see if the current obs is a new id, or follows a gap that is too large, and (2) look forward to see if the current obs is the last for a given id, or precedes a gap that is too large.

 

Untested, in the absence of sample data in the form of a working DATA step: 

%let max_gap=2;  /* Maximum allowable "hole" size */

data want (drop=_:);
  merge have
        have (firstobs=2 keep=id start_date rename=(id=_nxt_id start_date=_nxt_start));

  retain _initial_start;
  if id^=lag(id) or start-&max_gap > lag(stop_date) then _initial_start=start_date;

  if (id^=_nxt_id)  or  (_nxt_start > stop_date+&max_gap) ; /*Subsetting IF*/
  start_date=_initial_start;
run;

This code assumes that the data are grouped by ID, and are sorted chronologically within ID.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 999 views
  • 0 likes
  • 4 in conversation