<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Consolidate multiple observations based on validity ranges in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Consolidate-multiple-observations-based-on-validity-ranges/m-p/736589#M229497</link>
    <description>&lt;P&gt;I need to consolidate observations based on consecutive validity ranges.&lt;/P&gt;&lt;P&gt;Here is a reproducible minimal example:&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
infile datalines4 delimiter=",";
input id $ start :date9. end :date9. var1 var2 var3;
format start end date9.;
datalines4;
ABC1,01DEC2020,21FEB2021,1,0,1
ABC1,22FEB2021,10MAR2021,1,0,1
ABC1,11MAR2021,31DEC9999,1,1,1
DEF2,11DEC2020,10JAN2021,0,0,0
DEF2,11JAN2021,19JAN2021,1,1,1
DEF2,20JAN2021,02FEB2021,1,1,1
DEF2,03FEB2021,10MAR2025,1,0,1
;;;;
quit;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;For each id, if the validity dates are consecutive (i.e.&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;end_date + 1day&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;of the&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;n-1&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;observation is equal to the&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;start_date&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;of the&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;n&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;observation) and if they share the same value for all three variables (var1, var2 and var3), I want to consolidate those multiple observations into a single one.&lt;/P&gt;&lt;P&gt;What I would like to have:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="KermitTheFrog_0-1619185644692.png" style="width: 400px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/58631i4B7AC2FA8CF22A01/image-size/medium?v=v2&amp;amp;px=400" role="button" title="KermitTheFrog_0-1619185644692.png" alt="KermitTheFrog_0-1619185644692.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Any kind of approach (array/hash/sql) would be appreciated !&lt;/SPAN&gt;&lt;/P&gt;</description>
    <pubDate>Fri, 23 Apr 2021 13:47:43 GMT</pubDate>
    <dc:creator>KermitTheFrog</dc:creator>
    <dc:date>2021-04-23T13:47:43Z</dc:date>
    <item>
      <title>Consolidate multiple observations based on validity ranges</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Consolidate-multiple-observations-based-on-validity-ranges/m-p/736589#M229497</link>
      <description>&lt;P&gt;I need to consolidate observations based on consecutive validity ranges.&lt;/P&gt;&lt;P&gt;Here is a reproducible minimal example:&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
infile datalines4 delimiter=",";
input id $ start :date9. end :date9. var1 var2 var3;
format start end date9.;
datalines4;
ABC1,01DEC2020,21FEB2021,1,0,1
ABC1,22FEB2021,10MAR2021,1,0,1
ABC1,11MAR2021,31DEC9999,1,1,1
DEF2,11DEC2020,10JAN2021,0,0,0
DEF2,11JAN2021,19JAN2021,1,1,1
DEF2,20JAN2021,02FEB2021,1,1,1
DEF2,03FEB2021,10MAR2025,1,0,1
;;;;
quit;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;For each id, if the validity dates are consecutive (i.e.&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;end_date + 1day&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;of the&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;n-1&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;observation is equal to the&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;start_date&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;of the&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;n&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;observation) and if they share the same value for all three variables (var1, var2 and var3), I want to consolidate those multiple observations into a single one.&lt;/P&gt;&lt;P&gt;What I would like to have:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="KermitTheFrog_0-1619185644692.png" style="width: 400px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/58631i4B7AC2FA8CF22A01/image-size/medium?v=v2&amp;amp;px=400" role="button" title="KermitTheFrog_0-1619185644692.png" alt="KermitTheFrog_0-1619185644692.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Any kind of approach (array/hash/sql) would be appreciated !&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 23 Apr 2021 13:47:43 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Consolidate-multiple-observations-based-on-validity-ranges/m-p/736589#M229497</guid>
      <dc:creator>KermitTheFrog</dc:creator>
      <dc:date>2021-04-23T13:47:43Z</dc:date>
    </item>
    <item>
      <title>Re: Consolidate multiple observations based on validity ranges</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Consolidate-multiple-observations-based-on-validity-ranges/m-p/736705#M229555</link>
      <description>&lt;P&gt;Like this?&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data HAVE;
  infile datalines4 delimiter=",";
  input ID $ START :date9. END :date9. VAR1 VAR2 VAR3;
  format START END date9.;
datalines4;
ABC1,01DEC2020,21FEB2021,1,0,1
ABC1,22FEB2021,10MAR2021,1,0,1
ABC1,11MAR2021,31DEC9999,1,1,1
DEF2,11DEC2020,10JAN2021,0,0,0
DEF2,11JAN2021,19JAN2021,1,1,1
DEF2,20JAN2021,01FEB2021,1,1,1
DEF2,02FEB2021,02FEB2021,1,1,1
DEF2,03FEB2021,10MAR2025,1,0,1
run;

data WANT;
  set HAVE end=LASTOBS;
  if ^LASTOBS then set HAVE 
    (firstobs=2
     rename=(ID=_ID START=_START END=_END VAR1=_VAR1 VAR2=_VAR2 VAR3=_VAR3)) ;
  _SAMEVALUES= ( ID=_ID &amp;amp; VAR1=_VAR1 &amp;amp; VAR2=_VAR2 &amp;amp; VAR3=_VAR3);
  if LASTOBS | ^_SAMEVALUES then do;
    START=coalesce(_S,START);
    output;
    _S=.;
  end;
  else if ^_S then _S=START;   
  retain _S;   
  drop  _:;
run;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;DIV class="branch"&gt;&lt;BR /&gt;
&lt;DIV&gt;
&lt;DIV align="left"&gt;
&lt;TABLE class="table" summary="Procedure Print: Data Set WORK.WANT" frame="box" rules="all" cellspacing="0" cellpadding="5"&gt;&lt;COLGROUP&gt; &lt;COL /&gt; &lt;COL /&gt; &lt;COL /&gt; &lt;COL /&gt; &lt;COL /&gt; &lt;COL /&gt;&lt;/COLGROUP&gt;
&lt;THEAD&gt;
&lt;TR&gt;
&lt;TH class="l header" scope="col"&gt;ID&lt;/TH&gt;
&lt;TH class="r header" scope="col"&gt;START&lt;/TH&gt;
&lt;TH class="r header" scope="col"&gt;END&lt;/TH&gt;
&lt;TH class="r header" scope="col"&gt;VAR1&lt;/TH&gt;
&lt;TH class="r header" scope="col"&gt;VAR2&lt;/TH&gt;
&lt;TH class="r header" scope="col"&gt;VAR3&lt;/TH&gt;
&lt;/TR&gt;
&lt;/THEAD&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD class="l data"&gt;ABC1&lt;/TD&gt;
&lt;TD class="r data"&gt;01DEC2020&lt;/TD&gt;
&lt;TD class="r data"&gt;10MAR2021&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;0&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="l data"&gt;ABC1&lt;/TD&gt;
&lt;TD class="r data"&gt;11MAR2021&lt;/TD&gt;
&lt;TD class="r data"&gt;31DEC9999&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="l data"&gt;DEF2&lt;/TD&gt;
&lt;TD class="r data"&gt;11DEC2020&lt;/TD&gt;
&lt;TD class="r data"&gt;10JAN2021&lt;/TD&gt;
&lt;TD class="r data"&gt;0&lt;/TD&gt;
&lt;TD class="r data"&gt;0&lt;/TD&gt;
&lt;TD class="r data"&gt;0&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="l data"&gt;DEF2&lt;/TD&gt;
&lt;TD class="r data"&gt;11JAN2021&lt;/TD&gt;
&lt;TD class="r data"&gt;02FEB2021&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="l data"&gt;DEF2&lt;/TD&gt;
&lt;TD class="r data"&gt;03FEB2021&lt;/TD&gt;
&lt;TD class="r data"&gt;10MAR2025&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;0&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 24 Apr 2021 03:02:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Consolidate-multiple-observations-based-on-validity-ranges/m-p/736705#M229555</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2021-04-24T03:02:50Z</dc:date>
    </item>
    <item>
      <title>Re: Consolidate multiple observations based on validity ranges</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Consolidate-multiple-observations-based-on-validity-ranges/m-p/736736#M229576</link>
      <description>&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data HAVE;
  infile datalines4 delimiter=",";
  input ID $ START :date9. END :date9. VAR1 VAR2 VAR3;
  format START END date9.;
datalines4;
ABC1,01DEC2020,21FEB2021,1,0,1
ABC1,22FEB2021,10MAR2021,1,0,1
ABC1,11MAR2021,31DEC9999,1,1,1
DEF2,11DEC2020,10JAN2021,0,0,0
DEF2,11JAN2021,19JAN2021,1,1,1
DEF2,20JAN2021,01FEB2021,1,1,1
DEF2,02FEB2021,02FEB2021,1,1,1
DEF2,03FEB2021,10MAR2025,1,0,1
;;;;
run;

data temp;
 set have;
 do date=start to end;
  output;
 end;
 drop start end;
 format date date9.;
run;
proc sort data=temp out=temp1 nodupkey;
by id VAR1 VAR2 VAR3 date;
run;
data temp2;
 set temp1;
 by id VAR1 VAR2 VAR3;
 if first.var3 or dif(date) ne 1 then group+1;
run;
proc summary data=temp2;
by  id VAR1 VAR2 VAR3;
var date;
output out=want min=start max=end;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Sat, 24 Apr 2021 10:39:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Consolidate-multiple-observations-based-on-validity-ranges/m-p/736736#M229576</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2021-04-24T10:39:12Z</dc:date>
    </item>
  </channel>
</rss>

