SAS Programming

DATA Step, Macro, Functions and more
BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
KermitTheFrog
Fluorite | Level 6

I need to consolidate observations based on consecutive validity ranges.

Here is a reproducible minimal example:

data have;
infile datalines4 delimiter=",";
input id $ start :date9. end :date9. var1 var2 var3;
format start end date9.;
datalines4;
ABC1,01DEC2020,21FEB2021,1,0,1
ABC1,22FEB2021,10MAR2021,1,0,1
ABC1,11MAR2021,31DEC9999,1,1,1
DEF2,11DEC2020,10JAN2021,0,0,0
DEF2,11JAN2021,19JAN2021,1,1,1
DEF2,20JAN2021,02FEB2021,1,1,1
DEF2,03FEB2021,10MAR2025,1,0,1
;;;;
quit;

For each id, if the validity dates are consecutive (i.e. end_date + 1day of the n-1 observation is equal to the start_date of the n observation) and if they share the same value for all three variables (var1, var2 and var3), I want to consolidate those multiple observations into a single one.

What I would like to have:

 

KermitTheFrog_0-1619185644692.png

 

Any kind of approach (array/hash/sql) would be appreciated !

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
data HAVE;
  infile datalines4 delimiter=",";
  input ID $ START :date9. END :date9. VAR1 VAR2 VAR3;
  format START END date9.;
datalines4;
ABC1,01DEC2020,21FEB2021,1,0,1
ABC1,22FEB2021,10MAR2021,1,0,1
ABC1,11MAR2021,31DEC9999,1,1,1
DEF2,11DEC2020,10JAN2021,0,0,0
DEF2,11JAN2021,19JAN2021,1,1,1
DEF2,20JAN2021,01FEB2021,1,1,1
DEF2,02FEB2021,02FEB2021,1,1,1
DEF2,03FEB2021,10MAR2025,1,0,1
;;;;
run;

data temp;
 set have;
 do date=start to end;
  output;
 end;
 drop start end;
 format date date9.;
run;
proc sort data=temp out=temp1 nodupkey;
by id VAR1 VAR2 VAR3 date;
run;
data temp2;
 set temp1;
 by id VAR1 VAR2 VAR3;
 if first.var3 or dif(date) ne 1 then group+1;
run;
proc summary data=temp2;
by  id VAR1 VAR2 VAR3;
var date;
output out=want min=start max=end;
run;

View solution in original post

2 REPLIES 2
ChrisNZ
Tourmaline | Level 20

Like this?

data HAVE;
  infile datalines4 delimiter=",";
  input ID $ START :date9. END :date9. VAR1 VAR2 VAR3;
  format START END date9.;
datalines4;
ABC1,01DEC2020,21FEB2021,1,0,1
ABC1,22FEB2021,10MAR2021,1,0,1
ABC1,11MAR2021,31DEC9999,1,1,1
DEF2,11DEC2020,10JAN2021,0,0,0
DEF2,11JAN2021,19JAN2021,1,1,1
DEF2,20JAN2021,01FEB2021,1,1,1
DEF2,02FEB2021,02FEB2021,1,1,1
DEF2,03FEB2021,10MAR2025,1,0,1
run;

data WANT;
  set HAVE end=LASTOBS;
  if ^LASTOBS then set HAVE 
    (firstobs=2
     rename=(ID=_ID START=_START END=_END VAR1=_VAR1 VAR2=_VAR2 VAR3=_VAR3)) ;
  _SAMEVALUES= ( ID=_ID & VAR1=_VAR1 & VAR2=_VAR2 & VAR3=_VAR3);
  if LASTOBS | ^_SAMEVALUES then do;
    START=coalesce(_S,START);
    output;
    _S=.;
  end;
  else if ^_S then _S=START;   
  retain _S;   
  drop  _:;
run;

ID START END VAR1 VAR2 VAR3
ABC1 01DEC2020 10MAR2021 1 0 1
ABC1 11MAR2021 31DEC9999 1 1 1
DEF2 11DEC2020 10JAN2021 0 0 0
DEF2 11JAN2021 02FEB2021 1 1 1
DEF2 03FEB2021 10MAR2025 1 0 1

 

Ksharp
Super User
data HAVE;
  infile datalines4 delimiter=",";
  input ID $ START :date9. END :date9. VAR1 VAR2 VAR3;
  format START END date9.;
datalines4;
ABC1,01DEC2020,21FEB2021,1,0,1
ABC1,22FEB2021,10MAR2021,1,0,1
ABC1,11MAR2021,31DEC9999,1,1,1
DEF2,11DEC2020,10JAN2021,0,0,0
DEF2,11JAN2021,19JAN2021,1,1,1
DEF2,20JAN2021,01FEB2021,1,1,1
DEF2,02FEB2021,02FEB2021,1,1,1
DEF2,03FEB2021,10MAR2025,1,0,1
;;;;
run;

data temp;
 set have;
 do date=start to end;
  output;
 end;
 drop start end;
 format date date9.;
run;
proc sort data=temp out=temp1 nodupkey;
by id VAR1 VAR2 VAR3 date;
run;
data temp2;
 set temp1;
 by id VAR1 VAR2 VAR3;
 if first.var3 or dif(date) ne 1 then group+1;
run;
proc summary data=temp2;
by  id VAR1 VAR2 VAR3;
var date;
output out=want min=start max=end;
run;

sas-innovate-white.png

Our biggest data and AI event of the year.

Don’t miss the livestream kicking off May 7. It’s free. It’s easy. And it’s the best seat in the house.

Join us virtually with our complimentary SAS Innovate Digital Pass. Watch live or on-demand in multiple languages, with translations available to help you get the most out of every session.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 708 views
  • 0 likes
  • 3 in conversation