BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
WAL83
Obsidian | Level 7

Hello,

 

I'm trying to determine the number of patients in my dataset that have at least X contiguos paired records by date (DAY & NIGHT RECORD=PAIR). I need to delete records that do not have a PAIR and then create "PAIR#" value to flag contiguous records for the patient. PAIR# = 3 would indicate this is the 3rd contiguous PAIR of records for this patient. Start "PAIR#" over again at 1 if there are non-contiguous paired date records for a patient, in other words PAIR# does not have to be unique for a patient. See below example dataset.

 

ID PAIR# Start Date Start Time Value Delete?
1   12-Feb-14 6:00:00 DAY DELETE
1 1 12-Feb-14 6:01:00 DAY KEEP
1 1 12-Feb-14 22:00:00 NIGHT KEEP
1 2 13-Feb-14 6:00:00 DAY KEEP
1 2 13-Feb-14 22:00:00 NIGHT KEEP
1 3 14-Feb-14 6:00:00 DAY KEEP
1 3 14-Feb-14 22:00:00 NIGHT KEEP
3   16-Jan-13 22:00:00 NIGHT DELETE
3 1 17-Jan-13 6:00:00 DAY KEEP
3 1 17-Jan-13 22:00:00 NIGHT KEEP
3   18-Jan-13 22:00:00 NIGHT DELETE
3   21-Jan-13 22:00:00 NIGHT DELETE
3 1 22-Jan-13 6:00:00 DAY KEEP
3 1 22-Jan-13 22:00:00 NIGHT KEEP
3 2 23-Jan-13 6:00:00 DAY KEEP
3 2 23-Jan-13 22:00:00 NIGHT KEEP
4   9-Jan-13 22:00:00 NIGHT DELETE
4   10-Jan-13 22:00:00 NIGHT DELETE
4 1 11-Jan-13 6:01:00 DAY KEEP
4 1 11-Jan-13 22:00:00 NIGHT KEEP
4 2 12-Jan-13 6:01:00 DAY KEEP
4 2 12-Jan-13 22:00:00 NIGHT KEEP
5 1 20-Feb-13 22:00:00 NIGHT KEEP
5 1 20-Feb-13 6:01:00 DAY KEEP
5   22-Feb-13 22:00:00 NIGHT DELETE
5   24-Feb-13 22:00:00 NIGHT DELETE
5 1 6-Mar-13 6:01:00 DAY KEEP
5 1 6-Mar-13 22:00:00 NIGHT KEEP
5 2 7-Mar-13 6:01:00 DAY KEEP
5 2 7-Mar-13 22:00:00 NIGHT KEEP
6 1 4-Oct-13 6:00:00 DAY KEEP
6 1 4-Oct-13 22:00:00 NIGHT KEEP
6 2 5-Oct-13 6:00:00 DAY KEEP
6 2 5-Oct-13 22:00:00 NIGHT KEEP
6   7-Oct-13 6:00:00 DAY DELETE
1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

Indeed! I missed the case where it was the last.id and started with an otherwise potentially valid record. This code should correct for that:

 

data want (drop=_:);
  set have;
  by id;
  length action $6;
  retain _next_keep _last_kept_date _pair;
  if first.id then _pair=0;
  if not last.id then do;
    _next=_n_+1;
    set have (keep=start_date value
         rename=(start_date=_next_start
                 value=_next_value)) point=_next;
    if _next_keep='KEEP' then do;
      action='KEEP';
      _last_kept_date=start_date;
      pair=_pair;
      output;
      _next_keep='DELETE';
    end;
    else if start_date eq _next_start and
            value eq 'DAY' and
            _next_value eq 'NIGHT' then do;
      if 0<=(start_date-_last_kept_date)<=1 then _pair+1;
      else _pair=1;
      pair=_pair;
      action='KEEP';
      output;
      _next_keep='KEEP';
    end;
    else do;
      action='DELETE';
      output;
      _next_keep='DELETE';
    end;
  end;
  else do;
    if _next_keep eq 'KEEP' then do;
      action='KEEP';
      pair=_pair;
      output;
      _next_keep='DELETE';
    end;
    else do;
      action='DELETE';
      output;
      _next_keep='DELETE';
    end;
  end;
run;

Art, CEO, AnalystFinder.com

 

View solution in original post

18 REPLIES 18
art297
Opal | Level 21

I think that the following does what you want:

 

data want (drop=_:);
set have;
by id;
retain _next_keep pair;
if first.id then pair=0;
if not last.id then do;
_next=_n_+1;
set have (keep=start_date value
rename=(start_date=_next_start
value=_next_value)) point=_next;
if _next_keep='KEEP' then do;
output;
_next_keep='DELETE';
end;
else if start_date eq _next_start and
value eq 'DAY' and
_next_value eq 'NIGHT' then do;
pair+1;
output;
_next_keep='KEEP';
end;
else _next_keep='DELETE';
end;
else if _next_keep eq 'KEEP' then do;
output;
_next_keep='DELETE';
end;
run;

HTH,

Art, CEO, AnalystFinder.com

 

mkeintz
PROC Star

You identify pairs as any two consecutive records with the same id, same start_date, in which one record has value='DAY' and the other has value='NIGHT', in any order.

 

A pair is assigned a pair number, in which the first pair for an id is pair=1, and subsequent pairs are 2, 3, 4, ...    EXCEPT if a given pair is NOT exactly one day after the previous pair, then it is reset to pair=1.

 

data have;
  input (ID _PAIR) (1. +1 1.) Start_Date :date9. Start_Time time8.0 Value :$5.  _Delete $6.;
  format start_date date9. start_time time8.0;
datalines;
1   12-Feb-14 6:00:00 DAY DELETE 
1 1 12-Feb-14 6:01:00 DAY KEEP 
1 1 12-Feb-14 22:00:00 NIGHT KEEP 
1 2 13-Feb-14 6:00:00 DAY KEEP 
1 2 13-Feb-14 22:00:00 NIGHT KEEP 
1 3 14-Feb-14 6:00:00 DAY KEEP 
1 3 14-Feb-14 22:00:00 NIGHT KEEP 
3   16-Jan-13 22:00:00 NIGHT DELETE 
3 1 17-Jan-13 6:00:00 DAY KEEP 
3 1 17-Jan-13 22:00:00 NIGHT KEEP 
3   18-Jan-13 22:00:00 NIGHT DELETE 
3   21-Jan-13 22:00:00 NIGHT DELETE 
3 1 22-Jan-13 6:00:00 DAY KEEP 
3 1 22-Jan-13 22:00:00 NIGHT KEEP 
3 2 23-Jan-13 6:00:00 DAY KEEP 
3 2 23-Jan-13 22:00:00 NIGHT KEEP 
4   9-Jan-13 22:00:00 NIGHT DELETE 
4   10-Jan-13 22:00:00 NIGHT DELETE 
4 1 11-Jan-13 6:01:00 DAY KEEP 
4 1 11-Jan-13 22:00:00 NIGHT KEEP 
4 2 12-Jan-13 6:01:00 DAY KEEP 
4 2 12-Jan-13 22:00:00 NIGHT KEEP 
5 1 20-Feb-13 22:00:00 NIGHT KEEP 
5 1 20-Feb-13 6:01:00 DAY KEEP 
5   22-Feb-13 22:00:00 NIGHT DELETE 
5   24-Feb-13 22:00:00 NIGHT DELETE 
5 1 6-Mar-13 6:01:00 DAY KEEP 
5 1 6-Mar-13 22:00:00 NIGHT KEEP 
5 2 7-Mar-13 6:01:00 DAY KEEP 
5 2 7-Mar-13 22:00:00 NIGHT KEEP 
6 1 4-Oct-13 6:00:00 DAY KEEP 
6 1 4-Oct-13 22:00:00 NIGHT KEEP 
6 2 5-Oct-13 6:00:00 DAY KEEP 
6 2 5-Oct-13 22:00:00 NIGHT KEEP 
6   7-Oct-13 6:00:00 DAY DELETE 
run;
data want (drop=p nxt_val p_membr);

  set have (keep=id start_date);
  by id start_date;

  retain p 0       /* to track pair number */
         p_membr . /* 1=beginning of pair, 2=end of a pair */;

  if first.start_date then do until (1);
    if not(first.id=1 or lag(start_date)<start_date-1) then leave;
    p=0;
    p_membr=.;
  end;

  merge have 
        have (firstobs=2 keep=id value rename=(value=nxt_val));

  /* If last record was p_membr=1, this one is p_membr=2*/
  if p_membr=1 then do;
    pair=p;
	p_membr=2;
  end;
  else if last.start_date=0 and catx('/',value,nxt_val) in ('DAY/NIGHT','NIGHT/DAY') then do;
	p=p+1;
	pair=p;
	p_membr=1;
  end;
  else p_membr=.;
    
  if p_membr=. then action='DELETE';
  else action='KEEP';
run;

 

 

NOtes:

  1. The set with BY statement is just used to determine first.id, first.start_date, and corresponding last.'s
  2. The self merge has firstobs=2 so that one can look ahead at the next records VALUE variable, renamed to "nxt_val".
  3. P_MEMBR=1 for the first record in a pair, and p_membr=2 for the 2nd.  Since p_membr is a "retained" value, if it's a one, then the prior record was the beginning of a pair, and current record must be the end of the pair.
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
rogerjdeangelis
Barite | Level 11
This may be related to the solution you want. Did not see a example of output

data have;
  input (ID _PAIR) (1. +1 1.) Start_Date :date9. Start_Time time8.0 Value :$5.  _Delete $6.;
  format start_date date9. start_time time8.0;
  if _pair ne '' then do;
     input #1 (ID _PAIR) (1. +1 1.) Start_Date :date9. Start_Time time8.0 Value :$5.  _Delete $6.
           #2 (xID x_PAIR) (1. +1 1.) xStart_Date :date9. xStart_Time time8.0 xValue :$5.  x_Delete $6.;
     if value='DAY' and xvalue='NIGHT' then output;
  end;
cards4;
1   12-Feb-14 6:00:00 DAY DELETE
1 1 12-Feb-14 6:01:00 DAY KEEP
1 1 12-Feb-14 22:00:00 NIGHT KEEP
1 2 13-Feb-14 6:00:00 DAY KEEP
1 2 13-Feb-14 22:00:00 NIGHT KEEP
1 3 14-Feb-14 6:00:00 DAY KEEP
1 3 14-Feb-14 22:00:00 NIGHT KEEP
3   16-Jan-13 22:00:00 NIGHT DELETE
3 1 17-Jan-13 6:00:00 DAY KEEP
3 1 17-Jan-13 22:00:00 NIGHT KEEP
3   18-Jan-13 22:00:00 NIGHT DELETE
3   21-Jan-13 22:00:00 NIGHT DELETE
3 1 22-Jan-13 6:00:00 DAY KEEP
3 1 22-Jan-13 22:00:00 NIGHT KEEP
3 2 23-Jan-13 6:00:00 DAY KEEP
3 2 23-Jan-13 22:00:00 NIGHT KEEP
4   9-Jan-13 22:00:00 NIGHT DELETE
4   10-Jan-13 22:00:00 NIGHT DELETE
4 1 11-Jan-13 6:01:00 DAY KEEP
4 1 11-Jan-13 22:00:00 NIGHT KEEP
4 2 12-Jan-13 6:01:00 DAY KEEP
4 2 12-Jan-13 22:00:00 NIGHT KEEP
5 1 20-Feb-13 22:00:00 NIGHT KEEP
5 1 20-Feb-13 6:01:00 DAY KEEP
5   22-Feb-13 22:00:00 NIGHT DELETE
5   24-Feb-13 22:00:00 NIGHT DELETE
5 1 6-Mar-13 6:01:00 DAY KEEP
5 1 6-Mar-13 22:00:00 NIGHT KEEP
5 2 7-Mar-13 6:01:00 DAY KEEP
5 2 7-Mar-13 22:00:00 NIGHT KEEP
6 1 4-Oct-13 6:00:00 DAY KEEP
6 1 4-Oct-13 22:00:00 NIGHT KEEP
6 2 5-Oct-13 6:00:00 DAY KEEP
6 2 5-Oct-13 22:00:00 NIGHT KEEP
6   7-Oct-13 6:00:00 DAY DELETE
;;;;
run;quit;
WAL83
Obsidian | Level 7

My aplogies for not including an example output dataset. Also it may be easier to understand my goal if I only include variables available in my dataset.

 

data have;

input ID 4. Start_Date :date9. Start_Time time8.0 End_Time time8.0 Value :$5.;

format Start_Date date9. Start_Time time8.0 End_time time8.0;

datalines;

1024 22-Feb-14 6:00 21:59 DAY

1024 22-Feb-14 22:00 5:59 NIGHT

1024 23-Feb-14 6:00 21:59 DAY

1024 23-Feb-14 22:00 5:59 NIGHT

1024 24-Feb-14 6:00 21:59 DAY

1024 24-Feb-14 22:00 5:59 NIGHT

1040 3-Apr-14 6:00 21:59 DAY

1040 3-Apr-14 22:00 5:59 NIGHT

1040 4-Apr-14 22:00 5:59 NIGHT

1040 5-Apr-14 6:00 21:59 DAY

1040 5-Apr-14 22:00 5:59 NIGHT

1040 6-Apr-14 22:00 5:59 NIGHT

1040 7-Apr-14 6:00 21:59 DAY

1040 7-Apr-14 22:00 5:59 NIGHT

2003 1-Jul-13 6:00 21:59 DAY

2003 1-Jul-13 22:00 5:59 NIGHT

2003 2-Jul-13 22:00 5:59 NIGHT

2003 3-Jul-13 6:00 21:59 DAY

2003 3-Jul-13 22:00 5:59 NIGHT

2003 4-Jul-13 6:00 21:59 DAY

2003 4-Jul-13 22:00 5:59 NIGHT

2003 5-Jul-13 6:00 21:59 DAY

2010 2-Jun-13 6:01 21:59 DAY

2010 2-Jun-13 22:00 6:00 NIGHT

2010 16-Jul-13 6:01 21:59 DAY

2010 16-Jul-13 22:00 6:00 NIGHT

2010 17-Jul-13 6:01 21:59 DAY

2010 17-Jul-13 22:00 6:00 NIGHT

2010 24-Jul-13 6:01 21:59 DAY

2010 24-Jul-13 22:00 6:00 NIGHT

2010 25-Jul-13 22:00 6:00 NIGHT

2010 26-Jul-13 22:00 6:00 NIGHT

2010 28-Jul-13 22:00 6:00 NIGHT

2010 29-Jul-13 6:01 21:59 DAY

2010 29-Jul-13 22:00 6:00 NIGHT

2010 1-Aug-13 6:01 21:59 DAY

2010 9-Aug-13 6:01 21:59 DAY

2010 9-Aug-13 22:00 6:00 NIGHT

2010 10-Aug-13 6:01 21:59 DAY

2010 10-Aug-13 22:00 6:00 NIGHT

2010 11-Aug-13 22:00 6:00 NIGHT

2010 12-Aug-13 22:00 6:00 NIGHT

2010 1-Mar-14 22:00 5:59 NIGHT

2010 2-Mar-14 22:00 5:59 NIGHT

2010 3-Mar-14 22:00 5:59 NIGHT

2010 5-Mar-14 6:00 21:59 DAY

2010 5-Mar-14 22:00 5:59 NIGHT

2010 6-Mar-14 6:00 21:59 DAY

2010 6-Mar-14 22:00 5:59 NIGHT

run;

 

Output dataset with addition of 2 new variables added (Action, Pair).

 

Action Pair ID Start Date Start Time End Time Value
KEEP 1 1024 22-Feb-14 6:00 21:59 DAY
KEEP 1 1024 22-Feb-14 22:00 5:59 NIGHT
KEEP 2 1024 23-Feb-14 6:00 21:59 DAY
KEEP 2 1024 23-Feb-14 22:00 5:59 NIGHT
KEEP 3 1024 24-Feb-14 6:00 21:59 DAY
KEEP 3 1024 24-Feb-14 22:00 5:59 NIGHT
KEEP 1 1040 3-Apr-14 6:00 21:59 DAY
KEEP 1 1040 3-Apr-14 22:00 5:59 NIGHT
DELETE . 1040 4-Apr-14 22:00 5:59 NIGHT
KEEP 1 1040 5-Apr-14 6:00 21:59 DAY
KEEP 1 1040 5-Apr-14 22:00 5:59 NIGHT
DELETE . 1040 6-Apr-14 22:00 5:59 NIGHT
KEEP 1 1040 7-Apr-14 6:00 21:59 DAY
KEEP 1 1040 7-Apr-14 22:00 5:59 NIGHT
KEEP 1 2003 1-Jul-13 6:00 21:59 DAY
KEEP 1 2003 1-Jul-13 22:00 5:59 NIGHT
DELETE . 2003 2-Jul-13 22:00 5:59 NIGHT
KEEP 1 2003 3-Jul-13 6:00 21:59 DAY
KEEP 1 2003 3-Jul-13 22:00 5:59 NIGHT
KEEP 2 2003 4-Jul-13 6:00 21:59 DAY
KEEP 2 2003 4-Jul-13 22:00 5:59 NIGHT
DELETE . 2003 5-Jul-13 6:00 21:59 DAY
KEEP 1 2010 2-Jun-13 6:01 21:59 DAY
KEEP 1 2010 2-Jun-13 22:00 6:00 NIGHT
KEEP 1 2010 16-Jul-13 6:01 21:59 DAY
KEEP 1 2010 16-Jul-13 22:00 6:00 NIGHT
KEEP 2 2010 17-Jul-13 6:01 21:59 DAY
KEEP 2 2010 17-Jul-13 22:00 6:00 NIGHT
KEEP 3 2010 24-Jul-13 6:01 21:59 DAY
KEEP 3 2010 24-Jul-13 22:00 6:00 NIGHT
DELETE . 2010 25-Jul-13 22:00 6:00 NIGHT
DELETE . 2010 26-Jul-13 22:00 6:00 NIGHT
DELETE . 2010 28-Jul-13 22:00 6:00 NIGHT
KEEP 1 2010 29-Jul-13 6:01 21:59 DAY
KEEP 1 2010 29-Jul-13 22:00 6:00 NIGHT
DELETE . 2010 1-Aug-13 6:01 21:59 DAY
KEEP 1 2010 9-Aug-13 6:01 21:59 DAY
KEEP 1 2010 9-Aug-13 22:00 6:00 NIGHT
KEEP 2 2010 10-Aug-13 6:01 21:59 DAY
KEEP 2 2010 10-Aug-13 22:00 6:00 NIGHT
DELETE . 2010 11-Aug-13 22:00 6:00 NIGHT
DELETE . 2010 12-Aug-13 22:00 6:00 NIGHT
DELETE . 2010 1-Mar-14 22:00 5:59 NIGHT
DELETE . 2010 2-Mar-14 22:00 5:59 NIGHT
DELETE . 2010 3-Mar-14 22:00 5:59 NIGHT
KEEP 1 2010 5-Mar-14 6:00 21:59 DAY
KEEP 1 2010 5-Mar-14 22:00 5:59 NIGHT
KEEP 2 2010 6-Mar-14 6:00 21:59 DAY
KEEP 2 2010 6-Mar-14 22:00 5:59 NIGHT
art297
Opal | Level 21

The code I originally suggested assumed that you actually wanted to delete those records. The following does what you are currently asking for:

 

data want (drop=_:);
  set have;
  by id;
  length action $6;
  retain _next_keep pair;
  if first.id then pair=0;
  if not last.id then do;
    _next=_n_+1;
    set have (keep=start_date value
         rename=(start_date=_next_start
                 value=_next_value)) point=_next;
    if _next_keep='KEEP' then do;
      action='KEEP';
      output;
      _next_keep='DELETE';
    end;
    else if start_date eq _next_start and
            value eq 'DAY' and
            _next_value eq 'NIGHT' then do;
      pair+1;
      action='KEEP';
      output;
      _next_keep='KEEP';
    end;
    else do;
      action='DELETE';
      output;
      _next_keep='DELETE';
    end;
  end;
  else if _next_keep eq 'KEEP' then do;
    action='KEEP';
    output;
    _next_keep='DELETE';
  end;
run;

HTH,

Art, CEO, AnalystFinder.com

WAL83
Obsidian | Level 7

Thanks for the quick reply. Your adjusted code achieves all my goals except reset pair count to 1 if there are non-contiguous date records. See records in bold red in below example output dataset.

 

Action Pair ID Start Date Start Time End Time Value
KEEP 1 1024 22-Feb-14 6:00 21:59 DAY
KEEP 1 1024 22-Feb-14 22:00 5:59 NIGHT
KEEP 2 1024 23-Feb-14 6:00 21:59 DAY
KEEP 2 1024 23-Feb-14 22:00 5:59 NIGHT
KEEP 3 1024 24-Feb-14 6:00 21:59 DAY
KEEP 3 1024 24-Feb-14 22:00 5:59 NIGHT
KEEP 1 1040 3-Apr-14 6:00 21:59 DAY
KEEP 1 1040 3-Apr-14 22:00 5:59 NIGHT
DELETE . 1040 4-Apr-14 22:00 5:59 NIGHT
KEEP 1 1040 5-Apr-14 6:00 21:59 DAY
KEEP 1 1040 5-Apr-14 22:00 5:59 NIGHT
DELETE . 1040 6-Apr-14 22:00 5:59 NIGHT
KEEP 1 1040 7-Apr-14 6:00 21:59 DAY
KEEP 1 1040 7-Apr-14 22:00 5:59 NIGHT
KEEP 1 2003 1-Jul-13 6:00 21:59 DAY
KEEP 1 2003 1-Jul-13 22:00 5:59 NIGHT
DELETE . 2003 2-Jul-13 22:00 5:59 NIGHT
KEEP 1 2003 3-Jul-13 6:00 21:59 DAY
KEEP 1 2003 3-Jul-13 22:00 5:59 NIGHT
KEEP 2 2003 4-Jul-13 6:00 21:59 DAY
KEEP 2 2003 4-Jul-13 22:00 5:59 NIGHT
DELETE . 2003 5-Jul-13 6:00 21:59 DAY
KEEP 1 2010 2-Jun-13 6:01 21:59 DAY
KEEP 1 2010 2-Jun-13 22:00 6:00 NIGHT
KEEP 1 2010 16-Jul-13 6:01 21:59 DAY
KEEP 1 2010 16-Jul-13 22:00 6:00 NIGHT
KEEP 2 2010 17-Jul-13 6:01 21:59 DAY
KEEP 2 2010 17-Jul-13 22:00 6:00 NIGHT
KEEP 3 2010 24-Jul-13 6:01 21:59 DAY
KEEP 3 2010 24-Jul-13 22:00 6:00 NIGHT
DELETE . 2010 25-Jul-13 22:00 6:00 NIGHT
DELETE . 2010 26-Jul-13 22:00 6:00 NIGHT
DELETE . 2010 28-Jul-13 22:00 6:00 NIGHT
KEEP 1 2010 29-Jul-13 6:01 21:59 DAY
KEEP 1 2010 29-Jul-13 22:00 6:00 NIGHT
DELETE . 2010 1-Aug-13 6:01 21:59 DAY
KEEP 1 2010 9-Aug-13 6:01 21:59 DAY
KEEP 1 2010 9-Aug-13 22:00 6:00 NIGHT
KEEP 2 2010 10-Aug-13 6:01 21:59 DAY
KEEP 2 2010 10-Aug-13 22:00 6:00 NIGHT
DELETE . 2010 11-Aug-13 22:00 6:00 NIGHT
DELETE . 2010 12-Aug-13 22:00 6:00 NIGHT
DELETE . 2010 1-Mar-14 22:00 5:59 NIGHT
DELETE . 2010 2-Mar-14 22:00 5:59 NIGHT
DELETE . 2010 3-Mar-14 22:00 5:59 NIGHT
KEEP 1 2010 5-Mar-14 6:00 21:59 DAY
KEEP 1 2010 5-Mar-14 22:00 5:59 NIGHT
KEEP 2 2010 6-Mar-14 6:00 21:59 DAY
KEEP 2 2010 6-Mar-14 22:00 5:59 NIGHT

 

 

art297
Opal | Level 21

I question your pair assignment for the following two records:

KEEP 3 2010 24-Jul-13 6:01 21:59 DAY
KEEP 3 2010 24-Jul-13 22:00 6:00 NIGHT

 

according to your rules, I'd think they should be 1s.

 

If so, the following should work:

 

data want (drop=_:);
  set have;
  by id;
  length action $6;
  retain _next_keep _last_kept_date _pair;
  if first.id then _pair=0;
  if not last.id then do;
    _next=_n_+1;
    set have (keep=start_date value
         rename=(start_date=_next_start
                 value=_next_value)) point=_next;
    if _next_keep='KEEP' then do;
      action='KEEP';
      _last_kept_date=start_date;
      pair=_pair;
      output;
      _next_keep='DELETE';
    end;
    else if start_date eq _next_start and
            value eq 'DAY' and
            _next_value eq 'NIGHT' then do;
      if 0<=(start_date-_last_kept_date)<=1 then _pair+1;
      else _pair=1;
      pair=_pair;
      action='KEEP';
      output;
      _next_keep='KEEP';
    end;
    else do;
      action='DELETE';
      output;
      _next_keep='DELETE';
    end;
  end;
  else if _next_keep eq 'KEEP' then do;
    action='KEEP';
    pair=_pair;
    output;
    _next_keep='DELETE';
  end;
run;

HTH,

Art, CEO, AnalystFinder.com

 

mkeintz
PROC Star

This code produces what you want, includng the reset to pair=1 when a pair is more than one day after the most preceding pair.

 

HOWEVER, it appears your sample answer is wrong for the 24-July-2013 pair, which you have assigned as pair=3.  But the preceding pair is 1 week prior (17-JUL-13), so the 24-jul pair should be pair=1.

 

data want (drop=p nxt_val p_membr pair_date);

  set have (keep=id start_date);
  by id start_date;

  retain p 0       /* to track pair number */
         p_membr . /* 1=beginning of pair, 2=end of a pair */
         pair_date /* Date of most recent pair */

  if first.start_date=1 then do;
    if first.id then p=0;
    if pair_date < start_date-1 then p=0;
  end;

  merge have 
        have (firstobs=2 keep=id value rename=(value=nxt_val));

  /* If retain p_membr=1, this one must be p_membr=2*/
  if p_membr=1 then do;
    pair=p;
    p_membr=2;
  end;
  else if last.start_date=0 and catx('/',value,nxt_val) in ('DAY/NIGHT','NIGHT/DAY') then do;
    p_membr=1;
    pair_date=start_date;
    p=p+1;
    pair=p;
  end;
  else p_membr=.;
    
  if p_membr=. then action='DELETE';
  else action='KEEP';

run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
WAL83
Obsidian | Level 7

Aye, yes you are correct. Long day 😞

 

Your code does work very well although it renumbers the last pair id to the next id.. See example output dataset below...

 

ID Start_Date Start_Time End_Time Value pair action
1024 22-Feb-14 6:00:00 21:59:00 DAY 1 KEEP
1024 22-Feb-14 22:00:00 5:59:00 NIGHT 1 KEEP
1024 23-Feb-14 6:00:00 21:59:00 DAY 2 KEEP
1024 23-Feb-14 22:00:00 5:59:00 NIGHT 2 KEEP
1024 24-Feb-14 6:00:00 21:59:00 DAY 3 KEEP
1040 24-Feb-14 22:00:00 5:59:00 NIGHT 3 KEEP
1040 3-Apr-14 6:00:00 21:59:00 DAY 1 KEEP
1040 3-Apr-14 22:00:00 5:59:00 NIGHT 1 KEEP
1040 4-Apr-14 22:00:00 5:59:00 NIGHT . DELETE
1040 5-Apr-14 6:00:00 21:59:00 DAY 2 KEEP
1040 5-Apr-14 22:00:00 5:59:00 NIGHT 2 KEEP
1040 6-Apr-14 22:00:00 5:59:00 NIGHT . DELETE
1040 7-Apr-14 6:00:00 21:59:00 DAY 3 KEEP
2003 7-Apr-14 22:00:00 5:59:00 NIGHT 3 KEEP
2003 1-Jul-13 6:00:00 21:59:00 DAY 1 KEEP
2003 1-Jul-13 22:00:00 5:59:00 NIGHT 1 KEEP
2003 2-Jul-13 22:00:00 5:59:00 NIGHT . DELETE
2003 3-Jul-13 6:00:00 21:59:00 DAY 2 KEEP
2003 3-Jul-13 22:00:00 5:59:00 NIGHT 2 KEEP
2003 4-Jul-13 6:00:00 21:59:00 DAY 3 KEEP
2003 4-Jul-13 22:00:00 5:59:00 NIGHT 3 KEEP
2010 5-Jul-13 6:00:00 21:59:00 DAY . DELETE
2010 2-Jun-13 6:01:00 21:59:00 DAY 1 KEEP
2010 2-Jun-13 22:00:00 6:00:00 NIGHT 1 KEEP
2010 16-Jul-13 6:01:00 21:59:00 DAY 1 KEEP
2010 16-Jul-13 22:00:00 6:00:00 NIGHT 1 KEEP
2010 17-Jul-13 6:01:00 21:59:00 DAY 2 KEEP
2010 17-Jul-13 22:00:00 6:00:00 NIGHT 2 KEEP
2010 24-Jul-13 6:01:00 21:59:00 DAY 1 KEEP
2010 24-Jul-13 22:00:00 6:00:00 NIGHT 1 KEEP
2010 25-Jul-13 22:00:00 6:00:00 NIGHT . DELETE
2010 26-Jul-13 22:00:00 6:00:00 NIGHT . DELETE
2010 28-Jul-13 22:00:00 6:00:00 NIGHT . DELETE
2010 29-Jul-13 6:01:00 21:59:00 DAY 1 KEEP
2010 29-Jul-13 22:00:00 6:00:00 NIGHT 1 KEEP
2010 1-Aug-13 6:01:00 21:59:00 DAY . DELETE
2010 9-Aug-13 6:01:00 21:59:00 DAY 1 KEEP
2010 9-Aug-13 22:00:00 6:00:00 NIGHT 1 KEEP
2010 10-Aug-13 6:01:00 21:59:00 DAY 2 KEEP
2010 10-Aug-13 22:00:00 6:00:00 NIGHT 2 KEEP
2010 11-Aug-13 22:00:00 6:00:00 NIGHT . DELETE
2010 12-Aug-13 22:00:00 6:00:00 NIGHT . DELETE
2010 1-Mar-14 22:00:00 5:59:00 NIGHT . DELETE
2010 2-Mar-14 22:00:00 5:59:00 NIGHT . DELETE
2010 3-Mar-14 22:00:00 5:59:00 NIGHT . DELETE
2010 5-Mar-14 6:00:00 21:59:00 DAY 1 KEEP
2010 5-Mar-14 22:00:00 5:59:00 NIGHT 1 KEEP
2010 6-Mar-14 6:00:00 21:59:00 DAY 2 KEEP
2010 6-Mar-14 22:00:00 5:59:00 NIGHT 2 KEEP
art297
Opal | Level 21

Did you try my revised code?

 

Art, CEO, AnalystFinder.com

WAL83
Obsidian | Level 7

Yes. It works except for the resetting of pair = 1 when non-contiguous date occurs.

art297
Opal | Level 21

Not the code I posted about a half hour ago .. I don't think!

 

Art, CEO, AnalystFinder.com

 

WAL83
Obsidian | Level 7

I think it's working. I'm double checking. I'm getting tired 😞

mkeintz
PROC Star

Please mark @art297's message with the working code as the solution, not his request for you to test it.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 18 replies
  • 1565 views
  • 5 likes
  • 4 in conversation