BookmarkSubscribeRSS Feed
tomhead19
Calcite | Level 5

Hello,

 

I am trying to do what is described in the title. For example,  I am trying to determine how to make a continuous date and then delete the 2 duplicated to create 1 observation from the three. I have 3 observations, all sharing the same main variable (ID#). The other two variables, start date and stop date, should be continues if the time between those two dates is <2. So, like (fake dates) 02/28-03/11, 03/12-03/30. How can I code my programming to make it appear as 02/28-03/30 then delete the two duplicates. 

3 REPLIES 3
Tom
Super User Tom
Super User

What you are asking is how to collapse a time interval.  You want to collapse ranges where the GAP between them is smaller 3 days.

 

This is a common question on SAS communities.   For example see this recent thread.

 

https://communities.sas.com/t5/SAS-Programming/How-to-collapse-time-intervals-contingent-upon-some-a...

 

They did not allow a gap.  So just adjust the logic that is comparing the follow-up start dates to the previous end dates to allow your gap size.

 

Kurt_Bremser
Super User
data want;
set have;
set
  have (
    firstobs=2
    keep=id start_dt
    rename=(id=_id start_dt=_start)
  )
  have (
    obs=1
    keep=id start_dt
    rename=(id=-id start_dt=_start)
  )
;
retain _begin;
if _n_ = 1 then _begin = start_dt;
if id ne _id or _start gt stop_dt + 1
then do;
  start_dt = _begin;
  output;
  _begin = _start;
end;
drop _:;
run;

Untested, posted from my tablet.

mkeintz
PROC Star

So you have to (1) look back to see if the current obs is a new id, or follows a gap that is too large, and (2) look forward to see if the current obs is the last for a given id, or precedes a gap that is too large.

 

Untested, in the absence of sample data in the form of a working DATA step: 

%let max_gap=2;  /* Maximum allowable "hole" size */

data want (drop=_:);
  merge have
        have (firstobs=2 keep=id start_date rename=(id=_nxt_id start_date=_nxt_start));

  retain _initial_start;
  if id^=lag(id) or start-&max_gap > lag(stop_date) then _initial_start=start_date;

  if (id^=_nxt_id)  or  (_nxt_start > stop_date+&max_gap) ; /*Subsetting IF*/
  start_date=_initial_start;
run;

This code assumes that the data are grouped by ID, and are sorted chronologically within ID.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

sas-innovate-white.png

Special offer for SAS Communities members

Save $250 on SAS Innovate and get a free advance copy of the new SAS For Dummies book! Use the code "SASforDummies" to register. Don't miss out, May 6-9, in Orlando, Florida.

 

View the full agenda.

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 801 views
  • 0 likes
  • 4 in conversation