<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Selecting only the record with the smallest difference between dates in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Selecting-only-the-record-with-the-smallest-difference-between/m-p/743932#M233002</link>
    <description>&lt;P&gt;Thank you for your solutions! The issue has been solved, this has been very helpful.&lt;/P&gt;</description>
    <pubDate>Wed, 26 May 2021 16:00:29 GMT</pubDate>
    <dc:creator>twerwath</dc:creator>
    <dc:date>2021-05-26T16:00:29Z</dc:date>
    <item>
      <title>Selecting only the record with the smallest difference between dates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Selecting-only-the-record-with-the-smallest-difference-between/m-p/743896#M232991</link>
      <description>&lt;P&gt;I am working on a situation in which a specific event has occurred for an individual on a given date, and that person has multiple events and multiple personnel records. I want to keep only one record per event, the one closest to the event date (either before or after). I have an idea of how to do this, but am looking for the most efficient possible way since I am working with very large datasets.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I am also not sure what to do when there are two records that are equally close to the event date. In this case, either one could be used, but only one.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;An example of what I have:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;data have;&lt;BR /&gt;input id event_date record_date&lt;/P&gt;&lt;P&gt;datalines;&lt;BR /&gt;1 28JAN2019 24JAN2019.&amp;nbsp;&lt;BR /&gt;1 28JAN2019 26JAN2019.&amp;nbsp;&lt;/P&gt;&lt;P&gt;1 28JAN2019 3FEB2019.&amp;nbsp;&lt;/P&gt;&lt;P&gt;1 28JAN2019 4FEB2019.&amp;nbsp;&lt;/P&gt;&lt;P&gt;1 3FEB2019 24JAN2019.&amp;nbsp;&lt;BR /&gt;1 3FEB2019 26JAN2019.&amp;nbsp;&lt;/P&gt;&lt;P&gt;1 3FEB2019 3FEB2019.&amp;nbsp;&lt;/P&gt;&lt;P&gt;1 3FEB2019 4FEB2019.&amp;nbsp;&lt;/P&gt;&lt;P&gt;2 31JAN2019 28JAN2019.&amp;nbsp;&lt;BR /&gt;2 31JAN2019 1FEB2019.&amp;nbsp;&lt;/P&gt;&lt;P&gt;2 31JAN2019 3FEB2019.&amp;nbsp;&lt;/P&gt;&lt;P&gt;2 31JAN2019 4FEB2019.&amp;nbsp;&lt;/P&gt;&lt;P&gt;2 5FEB2019 28JAN2019.&amp;nbsp;&lt;BR /&gt;2 5FEB2019 1FEB2019.&amp;nbsp;&lt;/P&gt;&lt;P&gt;2 5FEB2019 3FEB2019.&amp;nbsp;&lt;/P&gt;&lt;P&gt;2 5FEB2019 4FEB2019.&amp;nbsp;&lt;/P&gt;&lt;P&gt;;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;An example of what I want:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;data want;&lt;BR /&gt;input id event_date record_date&lt;/P&gt;&lt;P&gt;datalines;&lt;BR /&gt;1 28JAN2019 26JAN2019.&amp;nbsp;&lt;/P&gt;&lt;P&gt;1 3FEB2019 3FEB2019.&amp;nbsp;&lt;/P&gt;&lt;P&gt;2 31JAN2019 1FEB2019.&amp;nbsp;&lt;/P&gt;&lt;P&gt;2 5FEB2019 4FEB2019.&amp;nbsp;&lt;/P&gt;&lt;P&gt;;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks in advance for any help with this!&lt;/P&gt;</description>
      <pubDate>Wed, 26 May 2021 14:42:29 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Selecting-only-the-record-with-the-smallest-difference-between/m-p/743896#M232991</guid>
      <dc:creator>twerwath</dc:creator>
      <dc:date>2021-05-26T14:42:29Z</dc:date>
    </item>
    <item>
      <title>Re: Selecting only the record with the smallest difference between dates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Selecting-only-the-record-with-the-smallest-difference-between/m-p/743907#M232992</link>
      <description>&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
input id (event_date record_date) (:date9.);
format event_date record_date date9.;
datalines;
1 28JAN2019 24JAN2019. 
1 28JAN2019 26JAN2019. 
1 28JAN2019 3FEB2019. 
1 28JAN2019 4FEB2019. 
1 3FEB2019 24JAN2019. 
1 3FEB2019 26JAN2019. 
1 3FEB2019 3FEB2019. 
1 3FEB2019 4FEB2019. 
2 31JAN2019 28JAN2019. 
2 31JAN2019 1FEB2019. 
2 31JAN2019 3FEB2019. 
2 31JAN2019 4FEB2019. 
2 5FEB2019 28JAN2019. 
2 5FEB2019 1FEB2019. 
2 5FEB2019 6FEB2019.  /*added one more to include same diff*/
2 5FEB2019 3FEB2019. 
2 5FEB2019 4FEB2019. 
;
run;


proc sql;
 create table want(drop=diff) as
 select distinct *
 from
 (select *, abs(event_date-record_date) as diff
 from have
 group by id,event_date
 having min(diff)=diff)
 group by id,event_date,diff
 having min(record_date)=record_date;
quit;

 


&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 26 May 2021 14:57:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Selecting-only-the-record-with-the-smallest-difference-between/m-p/743907#M232992</guid>
      <dc:creator>novinosrin</dc:creator>
      <dc:date>2021-05-26T14:57:22Z</dc:date>
    </item>
    <item>
      <title>Re: Selecting only the record with the smallest difference between dates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Selecting-only-the-record-with-the-smallest-difference-between/m-p/743913#M232996</link>
      <description>&lt;P&gt;If the dataset is sorted (or at least grouped) by id and event_date, do this:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want;
dist = 99999;
do until (last.event_date);
  set have;
  by id event_date;
  if abs(event_date - record_date) lt dist
  then do;
    dist = abs(event_date - record_date);
    rd = record_date;
  end;
end;
record_date = rd;
drop rd;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 26 May 2021 15:09:37 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Selecting-only-the-record-with-the-smallest-difference-between/m-p/743913#M232996</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2021-05-26T15:09:37Z</dc:date>
    </item>
    <item>
      <title>Re: Selecting only the record with the smallest difference between dates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Selecting-only-the-record-with-the-smallest-difference-between/m-p/743916#M232997</link>
      <description>&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
input id (event_date record_date) (:date9.);
format event_date record_date date9.;
datalines;
1 28JAN2019 24JAN2019. 
1 28JAN2019 26JAN2019. 
1 28JAN2019 3FEB2019. 
1 28JAN2019 4FEB2019. 
1 3FEB2019 24JAN2019. 
1 3FEB2019 26JAN2019. 
1 3FEB2019 3FEB2019. 
1 3FEB2019 4FEB2019. 
2 31JAN2019 28JAN2019. 
2 31JAN2019 1FEB2019. 
2 31JAN2019 3FEB2019. 
2 31JAN2019 4FEB2019. 
2 5FEB2019 28JAN2019. 
2 5FEB2019 1FEB2019. 
2 5FEB2019 3FEB2019. 
2 5FEB2019 4FEB2019.
2 5FEB2019 6FEB2019.  /*added one more to include same diff*/ 
;
run;

data want;
 do _n_=1 by 1 until(last.event_date);
  set have;
  by id event_date;
  diff=abs(event_date-record_date);
  if min&amp;gt;. and diff&amp;gt;=min then continue;
  min=diff;
  k=_n_;
 end;
 do _n_=1 to _n_;
  set have;
  if _n_=k then output;
 end;
 drop k min diff;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 26 May 2021 15:10:56 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Selecting-only-the-record-with-the-smallest-difference-between/m-p/743916#M232997</guid>
      <dc:creator>novinosrin</dc:creator>
      <dc:date>2021-05-26T15:10:56Z</dc:date>
    </item>
    <item>
      <title>Re: Selecting only the record with the smallest difference between dates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Selecting-only-the-record-with-the-smallest-difference-between/m-p/743919#M232998</link>
      <description>&lt;P&gt;Well if your version support CUROBS option in set statement, you could get a little aggressive rather than defensive -&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
input id (event_date record_date) (:date9.);
format event_date record_date date9.;
datalines;
1 28JAN2019 24JAN2019. 
1 28JAN2019 26JAN2019. 
1 28JAN2019 3FEB2019. 
1 28JAN2019 4FEB2019. 
1 3FEB2019 24JAN2019. 
1 3FEB2019 26JAN2019. 
1 3FEB2019 3FEB2019. 
1 3FEB2019 4FEB2019. 
2 31JAN2019 28JAN2019. 
2 31JAN2019 1FEB2019. 
2 31JAN2019 3FEB2019. 
2 31JAN2019 4FEB2019. 
2 5FEB2019 28JAN2019. 
2 5FEB2019 1FEB2019. 
2 5FEB2019 3FEB2019. 
2 5FEB2019 4FEB2019.
2 5FEB2019 6FEB2019.  /*added one more to include same diff*/ 
;
run;

data want;
 do until(last.event_date);
  set have curobs=_n_;
  by id event_date;
  diff=abs(event_date-record_date);
  if min&amp;gt;. and diff&amp;gt;=min then continue;
  min=diff;
  k=_n_;
 end;
 set have point=k;
 drop k min diff;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 26 May 2021 15:16:55 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Selecting-only-the-record-with-the-smallest-difference-between/m-p/743919#M232998</guid>
      <dc:creator>novinosrin</dc:creator>
      <dc:date>2021-05-26T15:16:55Z</dc:date>
    </item>
    <item>
      <title>Re: Selecting only the record with the smallest difference between dates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Selecting-only-the-record-with-the-smallest-difference-between/m-p/743924#M232999</link>
      <description>&lt;P&gt;You could even play with Hash contents for fun and to familiarize ways to park and pick object notations of hash methods-&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
input id (event_date record_date) (:date9.);
format event_date record_date date9.;
datalines;
1 28JAN2019 24JAN2019. 
1 28JAN2019 26JAN2019. 
1 28JAN2019 3FEB2019. 
1 28JAN2019 4FEB2019. 
1 3FEB2019 24JAN2019. 
1 3FEB2019 26JAN2019. 
1 3FEB2019 3FEB2019. 
1 3FEB2019 4FEB2019. 
2 31JAN2019 28JAN2019. 
2 31JAN2019 1FEB2019. 
2 31JAN2019 3FEB2019. 
2 31JAN2019 4FEB2019. 
2 5FEB2019 28JAN2019. 
2 5FEB2019 1FEB2019. 
2 5FEB2019 3FEB2019. 
2 5FEB2019 4FEB2019.
2 5FEB2019 6FEB2019.  /*added one more to include same diff*/ 
;
run;

data want;
 if _n_=1 then do;
  dcl hash h(dataset:'have(obs=0)');
  h.definekey('id');
  h.definedata(all:'y');
  h.definedone();
 end;
 do _n_=h.clear() by 0 until(last.event_date);
  set have;
  by id event_date;
  diff=abs(event_date-record_date);
  if min&amp;gt;. and diff&amp;gt;=min then continue;
  min=diff;
  h.replace();
 end;
 h.find();
 drop min diff;
run;
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 26 May 2021 15:38:33 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Selecting-only-the-record-with-the-smallest-difference-between/m-p/743924#M232999</guid>
      <dc:creator>novinosrin</dc:creator>
      <dc:date>2021-05-26T15:38:33Z</dc:date>
    </item>
    <item>
      <title>Re: Selecting only the record with the smallest difference between dates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Selecting-only-the-record-with-the-smallest-difference-between/m-p/743932#M233002</link>
      <description>&lt;P&gt;Thank you for your solutions! The issue has been solved, this has been very helpful.&lt;/P&gt;</description>
      <pubDate>Wed, 26 May 2021 16:00:29 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Selecting-only-the-record-with-the-smallest-difference-between/m-p/743932#M233002</guid>
      <dc:creator>twerwath</dc:creator>
      <dc:date>2021-05-26T16:00:29Z</dc:date>
    </item>
    <item>
      <title>Re: Selecting only the record with the smallest difference between dates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Selecting-only-the-record-with-the-smallest-difference-between/m-p/743951#M233012</link>
      <description>&lt;P&gt;This program assume data are sorted by ID/EVENT_DATE/RECORD_DATE:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Once the data step has reached either then end of an ID/EVENT_DATE group, or has a RECORD_DATE&amp;gt;EVENT_DATE, ;you merely have to decide whether the current record is closer than the prior record.&amp;nbsp; If the prior record is close, re-read it.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;This allows a way to read in ALL the variables that are associated with the closest record_date.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want (drop=_:);
  set have ;
  by id event_date;

  _prior_date=ifn(first.event_date=1,'15oct1582'd,lag(record_date));

  /* Find obs after which no future obs can qualify ... */
  if ((_prior_date&amp;lt;event_date) and 
      (event_date&amp;lt;=record_date or last.event_date))
     or
     (first.event_date=1 and event_date&amp;lt;=record_date);

  /* Decide whether to retrieve prior record */
  if (event_date-_prior_date)&amp;lt;abs(record_date-event_date) then do;
    p=_n_-1;
    set have point=p;
  end;
run;


&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The _prior_date var is set to 15oct1582&amp;nbsp; (beginning of Gregorian calendar) every time a new event_date is encountered.&amp;nbsp; This will prevent the program from reaching back to the prior event.&amp;nbsp;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 26 May 2021 16:43:42 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Selecting-only-the-record-with-the-smallest-difference-between/m-p/743951#M233012</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2021-05-26T16:43:42Z</dc:date>
    </item>
  </channel>
</rss>

