BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
WAL83
Obsidian | Level 7

Hello,

 

I'm trying to determine the number of patients in my dataset that have at least X contiguos paired records by date (DAY & NIGHT RECORD=PAIR). I need to delete records that do not have a PAIR and then create "PAIR#" value to flag contiguous records for the patient. PAIR# = 3 would indicate this is the 3rd contiguous PAIR of records for this patient. Start "PAIR#" over again at 1 if there are non-contiguous paired date records for a patient, in other words PAIR# does not have to be unique for a patient. See below example dataset.

 

ID PAIR# Start Date Start Time Value Delete?
1   12-Feb-14 6:00:00 DAY DELETE
1 1 12-Feb-14 6:01:00 DAY KEEP
1 1 12-Feb-14 22:00:00 NIGHT KEEP
1 2 13-Feb-14 6:00:00 DAY KEEP
1 2 13-Feb-14 22:00:00 NIGHT KEEP
1 3 14-Feb-14 6:00:00 DAY KEEP
1 3 14-Feb-14 22:00:00 NIGHT KEEP
3   16-Jan-13 22:00:00 NIGHT DELETE
3 1 17-Jan-13 6:00:00 DAY KEEP
3 1 17-Jan-13 22:00:00 NIGHT KEEP
3   18-Jan-13 22:00:00 NIGHT DELETE
3   21-Jan-13 22:00:00 NIGHT DELETE
3 1 22-Jan-13 6:00:00 DAY KEEP
3 1 22-Jan-13 22:00:00 NIGHT KEEP
3 2 23-Jan-13 6:00:00 DAY KEEP
3 2 23-Jan-13 22:00:00 NIGHT KEEP
4   9-Jan-13 22:00:00 NIGHT DELETE
4   10-Jan-13 22:00:00 NIGHT DELETE
4 1 11-Jan-13 6:01:00 DAY KEEP
4 1 11-Jan-13 22:00:00 NIGHT KEEP
4 2 12-Jan-13 6:01:00 DAY KEEP
4 2 12-Jan-13 22:00:00 NIGHT KEEP
5 1 20-Feb-13 22:00:00 NIGHT KEEP
5 1 20-Feb-13 6:01:00 DAY KEEP
5   22-Feb-13 22:00:00 NIGHT DELETE
5   24-Feb-13 22:00:00 NIGHT DELETE
5 1 6-Mar-13 6:01:00 DAY KEEP
5 1 6-Mar-13 22:00:00 NIGHT KEEP
5 2 7-Mar-13 6:01:00 DAY KEEP
5 2 7-Mar-13 22:00:00 NIGHT KEEP
6 1 4-Oct-13 6:00:00 DAY KEEP
6 1 4-Oct-13 22:00:00 NIGHT KEEP
6 2 5-Oct-13 6:00:00 DAY KEEP
6 2 5-Oct-13 22:00:00 NIGHT KEEP
6   7-Oct-13 6:00:00 DAY DELETE
1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

Indeed! I missed the case where it was the last.id and started with an otherwise potentially valid record. This code should correct for that:

 

data want (drop=_:);
  set have;
  by id;
  length action $6;
  retain _next_keep _last_kept_date _pair;
  if first.id then _pair=0;
  if not last.id then do;
    _next=_n_+1;
    set have (keep=start_date value
         rename=(start_date=_next_start
                 value=_next_value)) point=_next;
    if _next_keep='KEEP' then do;
      action='KEEP';
      _last_kept_date=start_date;
      pair=_pair;
      output;
      _next_keep='DELETE';
    end;
    else if start_date eq _next_start and
            value eq 'DAY' and
            _next_value eq 'NIGHT' then do;
      if 0<=(start_date-_last_kept_date)<=1 then _pair+1;
      else _pair=1;
      pair=_pair;
      action='KEEP';
      output;
      _next_keep='KEEP';
    end;
    else do;
      action='DELETE';
      output;
      _next_keep='DELETE';
    end;
  end;
  else do;
    if _next_keep eq 'KEEP' then do;
      action='KEEP';
      pair=_pair;
      output;
      _next_keep='DELETE';
    end;
    else do;
      action='DELETE';
      output;
      _next_keep='DELETE';
    end;
  end;
run;

Art, CEO, AnalystFinder.com

 

View solution in original post

18 REPLIES 18
art297
Opal | Level 21

I think that the following does what you want:

 

data want (drop=_:);
set have;
by id;
retain _next_keep pair;
if first.id then pair=0;
if not last.id then do;
_next=_n_+1;
set have (keep=start_date value
rename=(start_date=_next_start
value=_next_value)) point=_next;
if _next_keep='KEEP' then do;
output;
_next_keep='DELETE';
end;
else if start_date eq _next_start and
value eq 'DAY' and
_next_value eq 'NIGHT' then do;
pair+1;
output;
_next_keep='KEEP';
end;
else _next_keep='DELETE';
end;
else if _next_keep eq 'KEEP' then do;
output;
_next_keep='DELETE';
end;
run;

HTH,

Art, CEO, AnalystFinder.com

 

mkeintz
PROC Star

You identify pairs as any two consecutive records with the same id, same start_date, in which one record has value='DAY' and the other has value='NIGHT', in any order.

 

A pair is assigned a pair number, in which the first pair for an id is pair=1, and subsequent pairs are 2, 3, 4, ...    EXCEPT if a given pair is NOT exactly one day after the previous pair, then it is reset to pair=1.

 

data have;
  input (ID _PAIR) (1. +1 1.) Start_Date :date9. Start_Time time8.0 Value :$5.  _Delete $6.;
  format start_date date9. start_time time8.0;
datalines;
1   12-Feb-14 6:00:00 DAY DELETE 
1 1 12-Feb-14 6:01:00 DAY KEEP 
1 1 12-Feb-14 22:00:00 NIGHT KEEP 
1 2 13-Feb-14 6:00:00 DAY KEEP 
1 2 13-Feb-14 22:00:00 NIGHT KEEP 
1 3 14-Feb-14 6:00:00 DAY KEEP 
1 3 14-Feb-14 22:00:00 NIGHT KEEP 
3   16-Jan-13 22:00:00 NIGHT DELETE 
3 1 17-Jan-13 6:00:00 DAY KEEP 
3 1 17-Jan-13 22:00:00 NIGHT KEEP 
3   18-Jan-13 22:00:00 NIGHT DELETE 
3   21-Jan-13 22:00:00 NIGHT DELETE 
3 1 22-Jan-13 6:00:00 DAY KEEP 
3 1 22-Jan-13 22:00:00 NIGHT KEEP 
3 2 23-Jan-13 6:00:00 DAY KEEP 
3 2 23-Jan-13 22:00:00 NIGHT KEEP 
4   9-Jan-13 22:00:00 NIGHT DELETE 
4   10-Jan-13 22:00:00 NIGHT DELETE 
4 1 11-Jan-13 6:01:00 DAY KEEP 
4 1 11-Jan-13 22:00:00 NIGHT KEEP 
4 2 12-Jan-13 6:01:00 DAY KEEP 
4 2 12-Jan-13 22:00:00 NIGHT KEEP 
5 1 20-Feb-13 22:00:00 NIGHT KEEP 
5 1 20-Feb-13 6:01:00 DAY KEEP 
5   22-Feb-13 22:00:00 NIGHT DELETE 
5   24-Feb-13 22:00:00 NIGHT DELETE 
5 1 6-Mar-13 6:01:00 DAY KEEP 
5 1 6-Mar-13 22:00:00 NIGHT KEEP 
5 2 7-Mar-13 6:01:00 DAY KEEP 
5 2 7-Mar-13 22:00:00 NIGHT KEEP 
6 1 4-Oct-13 6:00:00 DAY KEEP 
6 1 4-Oct-13 22:00:00 NIGHT KEEP 
6 2 5-Oct-13 6:00:00 DAY KEEP 
6 2 5-Oct-13 22:00:00 NIGHT KEEP 
6   7-Oct-13 6:00:00 DAY DELETE 
run;
data want (drop=p nxt_val p_membr);

  set have (keep=id start_date);
  by id start_date;

  retain p 0       /* to track pair number */
         p_membr . /* 1=beginning of pair, 2=end of a pair */;

  if first.start_date then do until (1);
    if not(first.id=1 or lag(start_date)<start_date-1) then leave;
    p=0;
    p_membr=.;
  end;

  merge have 
        have (firstobs=2 keep=id value rename=(value=nxt_val));

  /* If last record was p_membr=1, this one is p_membr=2*/
  if p_membr=1 then do;
    pair=p;
	p_membr=2;
  end;
  else if last.start_date=0 and catx('/',value,nxt_val) in ('DAY/NIGHT','NIGHT/DAY') then do;
	p=p+1;
	pair=p;
	p_membr=1;
  end;
  else p_membr=.;
    
  if p_membr=. then action='DELETE';
  else action='KEEP';
run;

 

 

NOtes:

  1. The set with BY statement is just used to determine first.id, first.start_date, and corresponding last.'s
  2. The self merge has firstobs=2 so that one can look ahead at the next records VALUE variable, renamed to "nxt_val".
  3. P_MEMBR=1 for the first record in a pair, and p_membr=2 for the 2nd.  Since p_membr is a "retained" value, if it's a one, then the prior record was the beginning of a pair, and current record must be the end of the pair.
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
rogerjdeangelis
Barite | Level 11
This may be related to the solution you want. Did not see a example of output

data have;
  input (ID _PAIR) (1. +1 1.) Start_Date :date9. Start_Time time8.0 Value :$5.  _Delete $6.;
  format start_date date9. start_time time8.0;
  if _pair ne '' then do;
     input #1 (ID _PAIR) (1. +1 1.) Start_Date :date9. Start_Time time8.0 Value :$5.  _Delete $6.
           #2 (xID x_PAIR) (1. +1 1.) xStart_Date :date9. xStart_Time time8.0 xValue :$5.  x_Delete $6.;
     if value='DAY' and xvalue='NIGHT' then output;
  end;
cards4;
1   12-Feb-14 6:00:00 DAY DELETE
1 1 12-Feb-14 6:01:00 DAY KEEP
1 1 12-Feb-14 22:00:00 NIGHT KEEP
1 2 13-Feb-14 6:00:00 DAY KEEP
1 2 13-Feb-14 22:00:00 NIGHT KEEP
1 3 14-Feb-14 6:00:00 DAY KEEP
1 3 14-Feb-14 22:00:00 NIGHT KEEP
3   16-Jan-13 22:00:00 NIGHT DELETE
3 1 17-Jan-13 6:00:00 DAY KEEP
3 1 17-Jan-13 22:00:00 NIGHT KEEP
3   18-Jan-13 22:00:00 NIGHT DELETE
3   21-Jan-13 22:00:00 NIGHT DELETE
3 1 22-Jan-13 6:00:00 DAY KEEP
3 1 22-Jan-13 22:00:00 NIGHT KEEP
3 2 23-Jan-13 6:00:00 DAY KEEP
3 2 23-Jan-13 22:00:00 NIGHT KEEP
4   9-Jan-13 22:00:00 NIGHT DELETE
4   10-Jan-13 22:00:00 NIGHT DELETE
4 1 11-Jan-13 6:01:00 DAY KEEP
4 1 11-Jan-13 22:00:00 NIGHT KEEP
4 2 12-Jan-13 6:01:00 DAY KEEP
4 2 12-Jan-13 22:00:00 NIGHT KEEP
5 1 20-Feb-13 22:00:00 NIGHT KEEP
5 1 20-Feb-13 6:01:00 DAY KEEP
5   22-Feb-13 22:00:00 NIGHT DELETE
5   24-Feb-13 22:00:00 NIGHT DELETE
5 1 6-Mar-13 6:01:00 DAY KEEP
5 1 6-Mar-13 22:00:00 NIGHT KEEP
5 2 7-Mar-13 6:01:00 DAY KEEP
5 2 7-Mar-13 22:00:00 NIGHT KEEP
6 1 4-Oct-13 6:00:00 DAY KEEP
6 1 4-Oct-13 22:00:00 NIGHT KEEP
6 2 5-Oct-13 6:00:00 DAY KEEP
6 2 5-Oct-13 22:00:00 NIGHT KEEP
6   7-Oct-13 6:00:00 DAY DELETE
;;;;
run;quit;
WAL83
Obsidian | Level 7

My aplogies for not including an example output dataset. Also it may be easier to understand my goal if I only include variables available in my dataset.

 

data have;

input ID 4. Start_Date :date9. Start_Time time8.0 End_Time time8.0 Value :$5.;

format Start_Date date9. Start_Time time8.0 End_time time8.0;

datalines;

1024 22-Feb-14 6:00 21:59 DAY

1024 22-Feb-14 22:00 5:59 NIGHT

1024 23-Feb-14 6:00 21:59 DAY

1024 23-Feb-14 22:00 5:59 NIGHT

1024 24-Feb-14 6:00 21:59 DAY

1024 24-Feb-14 22:00 5:59 NIGHT

1040 3-Apr-14 6:00 21:59 DAY

1040 3-Apr-14 22:00 5:59 NIGHT

1040 4-Apr-14 22:00 5:59 NIGHT

1040 5-Apr-14 6:00 21:59 DAY

1040 5-Apr-14 22:00 5:59 NIGHT

1040 6-Apr-14 22:00 5:59 NIGHT

1040 7-Apr-14 6:00 21:59 DAY

1040 7-Apr-14 22:00 5:59 NIGHT

2003 1-Jul-13 6:00 21:59 DAY

2003 1-Jul-13 22:00 5:59 NIGHT

2003 2-Jul-13 22:00 5:59 NIGHT

2003 3-Jul-13 6:00 21:59 DAY

2003 3-Jul-13 22:00 5:59 NIGHT

2003 4-Jul-13 6:00 21:59 DAY

2003 4-Jul-13 22:00 5:59 NIGHT

2003 5-Jul-13 6:00 21:59 DAY

2010 2-Jun-13 6:01 21:59 DAY

2010 2-Jun-13 22:00 6:00 NIGHT

2010 16-Jul-13 6:01 21:59 DAY

2010 16-Jul-13 22:00 6:00 NIGHT

2010 17-Jul-13 6:01 21:59 DAY

2010 17-Jul-13 22:00 6:00 NIGHT

2010 24-Jul-13 6:01 21:59 DAY

2010 24-Jul-13 22:00 6:00 NIGHT

2010 25-Jul-13 22:00 6:00 NIGHT

2010 26-Jul-13 22:00 6:00 NIGHT

2010 28-Jul-13 22:00 6:00 NIGHT

2010 29-Jul-13 6:01 21:59 DAY

2010 29-Jul-13 22:00 6:00 NIGHT

2010 1-Aug-13 6:01 21:59 DAY

2010 9-Aug-13 6:01 21:59 DAY

2010 9-Aug-13 22:00 6:00 NIGHT

2010 10-Aug-13 6:01 21:59 DAY

2010 10-Aug-13 22:00 6:00 NIGHT

2010 11-Aug-13 22:00 6:00 NIGHT

2010 12-Aug-13 22:00 6:00 NIGHT

2010 1-Mar-14 22:00 5:59 NIGHT

2010 2-Mar-14 22:00 5:59 NIGHT

2010 3-Mar-14 22:00 5:59 NIGHT

2010 5-Mar-14 6:00 21:59 DAY

2010 5-Mar-14 22:00 5:59 NIGHT

2010 6-Mar-14 6:00 21:59 DAY

2010 6-Mar-14 22:00 5:59 NIGHT

run;

 

Output dataset with addition of 2 new variables added (Action, Pair).

 

Action Pair ID Start Date Start Time End Time Value
KEEP 1 1024 22-Feb-14 6:00 21:59 DAY
KEEP 1 1024 22-Feb-14 22:00 5:59 NIGHT
KEEP 2 1024 23-Feb-14 6:00 21:59 DAY
KEEP 2 1024 23-Feb-14 22:00 5:59 NIGHT
KEEP 3 1024 24-Feb-14 6:00 21:59 DAY
KEEP 3 1024 24-Feb-14 22:00 5:59 NIGHT
KEEP 1 1040 3-Apr-14 6:00 21:59 DAY
KEEP 1 1040 3-Apr-14 22:00 5:59 NIGHT
DELETE . 1040 4-Apr-14 22:00 5:59 NIGHT
KEEP 1 1040 5-Apr-14 6:00 21:59 DAY
KEEP 1 1040 5-Apr-14 22:00 5:59 NIGHT
DELETE . 1040 6-Apr-14 22:00 5:59 NIGHT
KEEP 1 1040 7-Apr-14 6:00 21:59 DAY
KEEP 1 1040 7-Apr-14 22:00 5:59 NIGHT
KEEP 1 2003 1-Jul-13 6:00 21:59 DAY
KEEP 1 2003 1-Jul-13 22:00 5:59 NIGHT
DELETE . 2003 2-Jul-13 22:00 5:59 NIGHT
KEEP 1 2003 3-Jul-13 6:00 21:59 DAY
KEEP 1 2003 3-Jul-13 22:00 5:59 NIGHT
KEEP 2 2003 4-Jul-13 6:00 21:59 DAY
KEEP 2 2003 4-Jul-13 22:00 5:59 NIGHT
DELETE . 2003 5-Jul-13 6:00 21:59 DAY
KEEP 1 2010 2-Jun-13 6:01 21:59 DAY
KEEP 1 2010 2-Jun-13 22:00 6:00 NIGHT
KEEP 1 2010 16-Jul-13 6:01 21:59 DAY
KEEP 1 2010 16-Jul-13 22:00 6:00 NIGHT
KEEP 2 2010 17-Jul-13 6:01 21:59 DAY
KEEP 2 2010 17-Jul-13 22:00 6:00 NIGHT
KEEP 3 2010 24-Jul-13 6:01 21:59 DAY
KEEP 3 2010 24-Jul-13 22:00 6:00 NIGHT
DELETE . 2010 25-Jul-13 22:00 6:00 NIGHT
DELETE . 2010 26-Jul-13 22:00 6:00 NIGHT
DELETE . 2010 28-Jul-13 22:00 6:00 NIGHT
KEEP 1 2010 29-Jul-13 6:01 21:59 DAY
KEEP 1 2010 29-Jul-13 22:00 6:00 NIGHT
DELETE . 2010 1-Aug-13 6:01 21:59 DAY
KEEP 1 2010 9-Aug-13 6:01 21:59 DAY
KEEP 1 2010 9-Aug-13 22:00 6:00 NIGHT
KEEP 2 2010 10-Aug-13 6:01 21:59 DAY
KEEP 2 2010 10-Aug-13 22:00 6:00 NIGHT
DELETE . 2010 11-Aug-13 22:00 6:00 NIGHT
DELETE . 2010 12-Aug-13 22:00 6:00 NIGHT
DELETE . 2010 1-Mar-14 22:00 5:59 NIGHT
DELETE . 2010 2-Mar-14 22:00 5:59 NIGHT
DELETE . 2010 3-Mar-14 22:00 5:59 NIGHT
KEEP 1 2010 5-Mar-14 6:00 21:59 DAY
KEEP 1 2010 5-Mar-14 22:00 5:59 NIGHT
KEEP 2 2010 6-Mar-14 6:00 21:59 DAY
KEEP 2 2010 6-Mar-14 22:00 5:59 NIGHT
art297
Opal | Level 21

The code I originally suggested assumed that you actually wanted to delete those records. The following does what you are currently asking for:

 

data want (drop=_:);
  set have;
  by id;
  length action $6;
  retain _next_keep pair;
  if first.id then pair=0;
  if not last.id then do;
    _next=_n_+1;
    set have (keep=start_date value
         rename=(start_date=_next_start
                 value=_next_value)) point=_next;
    if _next_keep='KEEP' then do;
      action='KEEP';
      output;
      _next_keep='DELETE';
    end;
    else if start_date eq _next_start and
            value eq 'DAY' and
            _next_value eq 'NIGHT' then do;
      pair+1;
      action='KEEP';
      output;
      _next_keep='KEEP';
    end;
    else do;
      action='DELETE';
      output;
      _next_keep='DELETE';
    end;
  end;
  else if _next_keep eq 'KEEP' then do;
    action='KEEP';
    output;
    _next_keep='DELETE';
  end;
run;

HTH,

Art, CEO, AnalystFinder.com

WAL83
Obsidian | Level 7

Thanks for the quick reply. Your adjusted code achieves all my goals except reset pair count to 1 if there are non-contiguous date records. See records in bold red in below example output dataset.

 

Action Pair ID Start Date Start Time End Time Value
KEEP 1 1024 22-Feb-14 6:00 21:59 DAY
KEEP 1 1024 22-Feb-14 22:00 5:59 NIGHT
KEEP 2 1024 23-Feb-14 6:00 21:59 DAY
KEEP 2 1024 23-Feb-14 22:00 5:59 NIGHT
KEEP 3 1024 24-Feb-14 6:00 21:59 DAY
KEEP 3 1024 24-Feb-14 22:00 5:59 NIGHT
KEEP 1 1040 3-Apr-14 6:00 21:59 DAY
KEEP 1 1040 3-Apr-14 22:00 5:59 NIGHT
DELETE . 1040 4-Apr-14 22:00 5:59 NIGHT
KEEP 1 1040 5-Apr-14 6:00 21:59 DAY
KEEP 1 1040 5-Apr-14 22:00 5:59 NIGHT
DELETE . 1040 6-Apr-14 22:00 5:59 NIGHT
KEEP 1 1040 7-Apr-14 6:00 21:59 DAY
KEEP 1 1040 7-Apr-14 22:00 5:59 NIGHT
KEEP 1 2003 1-Jul-13 6:00 21:59 DAY
KEEP 1 2003 1-Jul-13 22:00 5:59 NIGHT
DELETE . 2003 2-Jul-13 22:00 5:59 NIGHT
KEEP 1 2003 3-Jul-13 6:00 21:59 DAY
KEEP 1 2003 3-Jul-13 22:00 5:59 NIGHT
KEEP 2 2003 4-Jul-13 6:00 21:59 DAY
KEEP 2 2003 4-Jul-13 22:00 5:59 NIGHT
DELETE . 2003 5-Jul-13 6:00 21:59 DAY
KEEP 1 2010 2-Jun-13 6:01 21:59 DAY
KEEP 1 2010 2-Jun-13 22:00 6:00 NIGHT
KEEP 1 2010 16-Jul-13 6:01 21:59 DAY
KEEP 1 2010 16-Jul-13 22:00 6:00 NIGHT
KEEP 2 2010 17-Jul-13 6:01 21:59 DAY
KEEP 2 2010 17-Jul-13 22:00 6:00 NIGHT
KEEP 3 2010 24-Jul-13 6:01 21:59 DAY
KEEP 3 2010 24-Jul-13 22:00 6:00 NIGHT
DELETE . 2010 25-Jul-13 22:00 6:00 NIGHT
DELETE . 2010 26-Jul-13 22:00 6:00 NIGHT
DELETE . 2010 28-Jul-13 22:00 6:00 NIGHT
KEEP 1 2010 29-Jul-13 6:01 21:59 DAY
KEEP 1 2010 29-Jul-13 22:00 6:00 NIGHT
DELETE . 2010 1-Aug-13 6:01 21:59 DAY
KEEP 1 2010 9-Aug-13 6:01 21:59 DAY
KEEP 1 2010 9-Aug-13 22:00 6:00 NIGHT
KEEP 2 2010 10-Aug-13 6:01 21:59 DAY
KEEP 2 2010 10-Aug-13 22:00 6:00 NIGHT
DELETE . 2010 11-Aug-13 22:00 6:00 NIGHT
DELETE . 2010 12-Aug-13 22:00 6:00 NIGHT
DELETE . 2010 1-Mar-14 22:00 5:59 NIGHT
DELETE . 2010 2-Mar-14 22:00 5:59 NIGHT
DELETE . 2010 3-Mar-14 22:00 5:59 NIGHT
KEEP 1 2010 5-Mar-14 6:00 21:59 DAY
KEEP 1 2010 5-Mar-14 22:00 5:59 NIGHT
KEEP 2 2010 6-Mar-14 6:00 21:59 DAY
KEEP 2 2010 6-Mar-14 22:00 5:59 NIGHT

 

 

art297
Opal | Level 21

I question your pair assignment for the following two records:

KEEP 3 2010 24-Jul-13 6:01 21:59 DAY
KEEP 3 2010 24-Jul-13 22:00 6:00 NIGHT

 

according to your rules, I'd think they should be 1s.

 

If so, the following should work:

 

data want (drop=_:);
  set have;
  by id;
  length action $6;
  retain _next_keep _last_kept_date _pair;
  if first.id then _pair=0;
  if not last.id then do;
    _next=_n_+1;
    set have (keep=start_date value
         rename=(start_date=_next_start
                 value=_next_value)) point=_next;
    if _next_keep='KEEP' then do;
      action='KEEP';
      _last_kept_date=start_date;
      pair=_pair;
      output;
      _next_keep='DELETE';
    end;
    else if start_date eq _next_start and
            value eq 'DAY' and
            _next_value eq 'NIGHT' then do;
      if 0<=(start_date-_last_kept_date)<=1 then _pair+1;
      else _pair=1;
      pair=_pair;
      action='KEEP';
      output;
      _next_keep='KEEP';
    end;
    else do;
      action='DELETE';
      output;
      _next_keep='DELETE';
    end;
  end;
  else if _next_keep eq 'KEEP' then do;
    action='KEEP';
    pair=_pair;
    output;
    _next_keep='DELETE';
  end;
run;

HTH,

Art, CEO, AnalystFinder.com

 

mkeintz
PROC Star

This code produces what you want, includng the reset to pair=1 when a pair is more than one day after the most preceding pair.

 

HOWEVER, it appears your sample answer is wrong for the 24-July-2013 pair, which you have assigned as pair=3.  But the preceding pair is 1 week prior (17-JUL-13), so the 24-jul pair should be pair=1.

 

data want (drop=p nxt_val p_membr pair_date);

  set have (keep=id start_date);
  by id start_date;

  retain p 0       /* to track pair number */
         p_membr . /* 1=beginning of pair, 2=end of a pair */
         pair_date /* Date of most recent pair */

  if first.start_date=1 then do;
    if first.id then p=0;
    if pair_date < start_date-1 then p=0;
  end;

  merge have 
        have (firstobs=2 keep=id value rename=(value=nxt_val));

  /* If retain p_membr=1, this one must be p_membr=2*/
  if p_membr=1 then do;
    pair=p;
    p_membr=2;
  end;
  else if last.start_date=0 and catx('/',value,nxt_val) in ('DAY/NIGHT','NIGHT/DAY') then do;
    p_membr=1;
    pair_date=start_date;
    p=p+1;
    pair=p;
  end;
  else p_membr=.;
    
  if p_membr=. then action='DELETE';
  else action='KEEP';

run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
WAL83
Obsidian | Level 7

Aye, yes you are correct. Long day 😞

 

Your code does work very well although it renumbers the last pair id to the next id.. See example output dataset below...

 

ID Start_Date Start_Time End_Time Value pair action
1024 22-Feb-14 6:00:00 21:59:00 DAY 1 KEEP
1024 22-Feb-14 22:00:00 5:59:00 NIGHT 1 KEEP
1024 23-Feb-14 6:00:00 21:59:00 DAY 2 KEEP
1024 23-Feb-14 22:00:00 5:59:00 NIGHT 2 KEEP
1024 24-Feb-14 6:00:00 21:59:00 DAY 3 KEEP
1040 24-Feb-14 22:00:00 5:59:00 NIGHT 3 KEEP
1040 3-Apr-14 6:00:00 21:59:00 DAY 1 KEEP
1040 3-Apr-14 22:00:00 5:59:00 NIGHT 1 KEEP
1040 4-Apr-14 22:00:00 5:59:00 NIGHT . DELETE
1040 5-Apr-14 6:00:00 21:59:00 DAY 2 KEEP
1040 5-Apr-14 22:00:00 5:59:00 NIGHT 2 KEEP
1040 6-Apr-14 22:00:00 5:59:00 NIGHT . DELETE
1040 7-Apr-14 6:00:00 21:59:00 DAY 3 KEEP
2003 7-Apr-14 22:00:00 5:59:00 NIGHT 3 KEEP
2003 1-Jul-13 6:00:00 21:59:00 DAY 1 KEEP
2003 1-Jul-13 22:00:00 5:59:00 NIGHT 1 KEEP
2003 2-Jul-13 22:00:00 5:59:00 NIGHT . DELETE
2003 3-Jul-13 6:00:00 21:59:00 DAY 2 KEEP
2003 3-Jul-13 22:00:00 5:59:00 NIGHT 2 KEEP
2003 4-Jul-13 6:00:00 21:59:00 DAY 3 KEEP
2003 4-Jul-13 22:00:00 5:59:00 NIGHT 3 KEEP
2010 5-Jul-13 6:00:00 21:59:00 DAY . DELETE
2010 2-Jun-13 6:01:00 21:59:00 DAY 1 KEEP
2010 2-Jun-13 22:00:00 6:00:00 NIGHT 1 KEEP
2010 16-Jul-13 6:01:00 21:59:00 DAY 1 KEEP
2010 16-Jul-13 22:00:00 6:00:00 NIGHT 1 KEEP
2010 17-Jul-13 6:01:00 21:59:00 DAY 2 KEEP
2010 17-Jul-13 22:00:00 6:00:00 NIGHT 2 KEEP
2010 24-Jul-13 6:01:00 21:59:00 DAY 1 KEEP
2010 24-Jul-13 22:00:00 6:00:00 NIGHT 1 KEEP
2010 25-Jul-13 22:00:00 6:00:00 NIGHT . DELETE
2010 26-Jul-13 22:00:00 6:00:00 NIGHT . DELETE
2010 28-Jul-13 22:00:00 6:00:00 NIGHT . DELETE
2010 29-Jul-13 6:01:00 21:59:00 DAY 1 KEEP
2010 29-Jul-13 22:00:00 6:00:00 NIGHT 1 KEEP
2010 1-Aug-13 6:01:00 21:59:00 DAY . DELETE
2010 9-Aug-13 6:01:00 21:59:00 DAY 1 KEEP
2010 9-Aug-13 22:00:00 6:00:00 NIGHT 1 KEEP
2010 10-Aug-13 6:01:00 21:59:00 DAY 2 KEEP
2010 10-Aug-13 22:00:00 6:00:00 NIGHT 2 KEEP
2010 11-Aug-13 22:00:00 6:00:00 NIGHT . DELETE
2010 12-Aug-13 22:00:00 6:00:00 NIGHT . DELETE
2010 1-Mar-14 22:00:00 5:59:00 NIGHT . DELETE
2010 2-Mar-14 22:00:00 5:59:00 NIGHT . DELETE
2010 3-Mar-14 22:00:00 5:59:00 NIGHT . DELETE
2010 5-Mar-14 6:00:00 21:59:00 DAY 1 KEEP
2010 5-Mar-14 22:00:00 5:59:00 NIGHT 1 KEEP
2010 6-Mar-14 6:00:00 21:59:00 DAY 2 KEEP
2010 6-Mar-14 22:00:00 5:59:00 NIGHT 2 KEEP
art297
Opal | Level 21

Did you try my revised code?

 

Art, CEO, AnalystFinder.com

WAL83
Obsidian | Level 7

Yes. It works except for the resetting of pair = 1 when non-contiguous date occurs.

art297
Opal | Level 21

Not the code I posted about a half hour ago .. I don't think!

 

Art, CEO, AnalystFinder.com

 

WAL83
Obsidian | Level 7

I think it's working. I'm double checking. I'm getting tired 😞

mkeintz
PROC Star

Please mark @art297's message with the working code as the solution, not his request for you to test it.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

Catch up on SAS Innovate 2026

Nearly 200 sessions are now available on demand with the SAS Innovate Digital Pass.

Explore Now →
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 18 replies
  • 7633 views
  • 5 likes
  • 4 in conversation