SAS Programming

DATA Step, Macro, Functions and more
BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
KermitTheFrog
Fluorite | Level 6

I need to consolidate observations based on consecutive validity ranges.

Here is a reproducible minimal example:

data have;
infile datalines4 delimiter=",";
input id $ start :date9. end :date9. var1 var2 var3;
format start end date9.;
datalines4;
ABC1,01DEC2020,21FEB2021,1,0,1
ABC1,22FEB2021,10MAR2021,1,0,1
ABC1,11MAR2021,31DEC9999,1,1,1
DEF2,11DEC2020,10JAN2021,0,0,0
DEF2,11JAN2021,19JAN2021,1,1,1
DEF2,20JAN2021,02FEB2021,1,1,1
DEF2,03FEB2021,10MAR2025,1,0,1
;;;;
quit;

For each id, if the validity dates are consecutive (i.e. end_date + 1day of the n-1 observation is equal to the start_date of the n observation) and if they share the same value for all three variables (var1, var2 and var3), I want to consolidate those multiple observations into a single one.

What I would like to have:

 

KermitTheFrog_0-1619185644692.png

 

Any kind of approach (array/hash/sql) would be appreciated !

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
data HAVE;
  infile datalines4 delimiter=",";
  input ID $ START :date9. END :date9. VAR1 VAR2 VAR3;
  format START END date9.;
datalines4;
ABC1,01DEC2020,21FEB2021,1,0,1
ABC1,22FEB2021,10MAR2021,1,0,1
ABC1,11MAR2021,31DEC9999,1,1,1
DEF2,11DEC2020,10JAN2021,0,0,0
DEF2,11JAN2021,19JAN2021,1,1,1
DEF2,20JAN2021,01FEB2021,1,1,1
DEF2,02FEB2021,02FEB2021,1,1,1
DEF2,03FEB2021,10MAR2025,1,0,1
;;;;
run;

data temp;
 set have;
 do date=start to end;
  output;
 end;
 drop start end;
 format date date9.;
run;
proc sort data=temp out=temp1 nodupkey;
by id VAR1 VAR2 VAR3 date;
run;
data temp2;
 set temp1;
 by id VAR1 VAR2 VAR3;
 if first.var3 or dif(date) ne 1 then group+1;
run;
proc summary data=temp2;
by  id VAR1 VAR2 VAR3;
var date;
output out=want min=start max=end;
run;

View solution in original post

2 REPLIES 2
ChrisNZ
Tourmaline | Level 20

Like this?

data HAVE;
  infile datalines4 delimiter=",";
  input ID $ START :date9. END :date9. VAR1 VAR2 VAR3;
  format START END date9.;
datalines4;
ABC1,01DEC2020,21FEB2021,1,0,1
ABC1,22FEB2021,10MAR2021,1,0,1
ABC1,11MAR2021,31DEC9999,1,1,1
DEF2,11DEC2020,10JAN2021,0,0,0
DEF2,11JAN2021,19JAN2021,1,1,1
DEF2,20JAN2021,01FEB2021,1,1,1
DEF2,02FEB2021,02FEB2021,1,1,1
DEF2,03FEB2021,10MAR2025,1,0,1
run;

data WANT;
  set HAVE end=LASTOBS;
  if ^LASTOBS then set HAVE 
    (firstobs=2
     rename=(ID=_ID START=_START END=_END VAR1=_VAR1 VAR2=_VAR2 VAR3=_VAR3)) ;
  _SAMEVALUES= ( ID=_ID & VAR1=_VAR1 & VAR2=_VAR2 & VAR3=_VAR3);
  if LASTOBS | ^_SAMEVALUES then do;
    START=coalesce(_S,START);
    output;
    _S=.;
  end;
  else if ^_S then _S=START;   
  retain _S;   
  drop  _:;
run;

ID START END VAR1 VAR2 VAR3
ABC1 01DEC2020 10MAR2021 1 0 1
ABC1 11MAR2021 31DEC9999 1 1 1
DEF2 11DEC2020 10JAN2021 0 0 0
DEF2 11JAN2021 02FEB2021 1 1 1
DEF2 03FEB2021 10MAR2025 1 0 1

 

Ksharp
Super User
data HAVE;
  infile datalines4 delimiter=",";
  input ID $ START :date9. END :date9. VAR1 VAR2 VAR3;
  format START END date9.;
datalines4;
ABC1,01DEC2020,21FEB2021,1,0,1
ABC1,22FEB2021,10MAR2021,1,0,1
ABC1,11MAR2021,31DEC9999,1,1,1
DEF2,11DEC2020,10JAN2021,0,0,0
DEF2,11JAN2021,19JAN2021,1,1,1
DEF2,20JAN2021,01FEB2021,1,1,1
DEF2,02FEB2021,02FEB2021,1,1,1
DEF2,03FEB2021,10MAR2025,1,0,1
;;;;
run;

data temp;
 set have;
 do date=start to end;
  output;
 end;
 drop start end;
 format date date9.;
run;
proc sort data=temp out=temp1 nodupkey;
by id VAR1 VAR2 VAR3 date;
run;
data temp2;
 set temp1;
 by id VAR1 VAR2 VAR3;
 if first.var3 or dif(date) ne 1 then group+1;
run;
proc summary data=temp2;
by  id VAR1 VAR2 VAR3;
var date;
output out=want min=start max=end;
run;

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 551 views
  • 0 likes
  • 3 in conversation