<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Merging two databases based on fuzzy dates (similar dates) in SAS Data Management</title>
    <link>https://communities.sas.com/t5/SAS-Data-Management/Merging-two-databases-based-on-fuzzy-dates-similar-dates/m-p/405674#M12379</link>
    <description>&lt;P&gt;First, thank you for your help. For the most part, this has been working, with the exception of what you had mentioned, when two observations (with the same SN) appear in table a, but only once in table b, they will both be included in table want, but both will&amp;nbsp;correspond to the same dates and variables as the one case from table b&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;One additional question, in the following line:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE class=" language-sas"&gt;&lt;CODE class="  language-sas"&gt;      having &lt;SPAN class="token function"&gt;min&lt;/SPAN&gt;&lt;SPAN class="token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token function"&gt;abs&lt;/SPAN&gt;&lt;SPAN class="token punctuation"&gt;(&lt;/SPAN&gt;a&lt;SPAN class="token punctuation"&gt;.&lt;/SPAN&gt;start&lt;SPAN class="token operator"&gt;-&lt;/SPAN&gt;b&lt;SPAN class="token punctuation"&gt;.&lt;/SPAN&gt;start&lt;SPAN class="token punctuation"&gt;)&lt;/SPAN&gt;&lt;SPAN class="token operator"&gt;+&lt;/SPAN&gt;&lt;SPAN class="token function"&gt;abs&lt;/SPAN&gt;&lt;SPAN class="token punctuation"&gt;(&lt;/SPAN&gt;a&lt;SPAN class="token punctuation"&gt;.&lt;/SPAN&gt;end&lt;SPAN class="token operator"&gt;-&lt;/SPAN&gt;b&lt;SPAN class="token punctuation"&gt;.&lt;/SPAN&gt;end&lt;SPAN class="token punctuation"&gt;)&lt;/SPAN&gt;&lt;SPAN class="token punctuation"&gt;)&lt;/SPAN&gt; &lt;SPAN class="token operator"&gt;=&lt;/SPAN&gt; &lt;SPAN class="token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token function"&gt;abs&lt;/SPAN&gt;&lt;SPAN class="token punctuation"&gt;(&lt;/SPAN&gt;a&lt;SPAN class="token punctuation"&gt;.&lt;/SPAN&gt;start&lt;SPAN class="token operator"&gt;-&lt;/SPAN&gt;b&lt;SPAN class="token punctuation"&gt;.&lt;/SPAN&gt;start&lt;SPAN class="token punctuation"&gt;)&lt;/SPAN&gt;&lt;SPAN class="token operator"&gt;+&lt;/SPAN&gt;&lt;SPAN class="token function"&gt;abs&lt;/SPAN&gt;&lt;SPAN class="token punctuation"&gt;(&lt;/SPAN&gt;a&lt;SPAN class="token punctuation"&gt;.&lt;/SPAN&gt;end&lt;SPAN class="token operator"&gt;-&lt;/SPAN&gt;b&lt;SPAN class="token punctuation"&gt;.&lt;/SPAN&gt;end&lt;SPAN class="token punctuation"&gt;))&lt;/SPAN&gt;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;what is the purpose of the syntax after the equals (=) sign?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Many thanks!&lt;/P&gt;</description>
    <pubDate>Thu, 19 Oct 2017 17:53:32 GMT</pubDate>
    <dc:creator>djohn051</dc:creator>
    <dc:date>2017-10-19T17:53:32Z</dc:date>
    <item>
      <title>Merging two databases based on fuzzy dates (similar dates)</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/Merging-two-databases-based-on-fuzzy-dates-similar-dates/m-p/404188#M12296</link>
      <description>&lt;P&gt;I will try to explain this the best that I can.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I currently have two databases: Database A and Database B - Both of these databases have the same people in them (id numbers), but the amount of observations is different (longfile). Database A features dates (start and end date) that I would like to keep. These dates are for a timeframe that was worked. Database B features some variables with details about the work&amp;nbsp;&lt;SPAN&gt;(Var_1 Var_2 Var_3)&lt;/SPAN&gt; and also features dates (start and end date), but these dates were tentative and the ones in Database A are the ones I want to keep.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;What I would like to do, is for each observation in Database A, try to identify the case from Database B with the most similar start and end date (for the same ID number) and then merge the line together, taking the variables about work details from Database B and the start and end date variables from Database A. Database A is to be used as the spine so for example:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Database A:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;ID1 01/01/2000 31/01/2000&lt;/LI&gt;&lt;LI&gt;ID1 25/02/2001&amp;nbsp;&lt;SPAN&gt;20/10/2001&lt;/SPAN&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&lt;SPAN&gt;Database B:&lt;/SPAN&gt;&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;ID1 03/01/2000 01/02/2000 &lt;SPAN&gt;Var_1 Var_2 Var_3&lt;/SPAN&gt;&lt;/LI&gt;&lt;LI&gt;ID1 25/04/2000&amp;nbsp;&lt;SPAN&gt;28/05/2000 Var_1 Var_2 Var_3&lt;/SPAN&gt;&lt;/LI&gt;&lt;LI&gt;&lt;SPAN&gt;ID1 25/07/2000&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;28/08/2000&lt;/SPAN&gt; &lt;SPAN&gt;Var_1 Var_2 Var_3&lt;/SPAN&gt;&lt;/LI&gt;&lt;LI&gt;ID1 20/02/2001 17&lt;SPAN&gt;/10/2001&amp;nbsp;Var_1 Var_2 Var_3&lt;/SPAN&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&lt;SPAN&gt;Merged Database:&lt;/SPAN&gt;&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Rows 1 and 4 have the same ID# as in Database A and have the most similar dates. Dates from Database A are kept and var_1-var_3 are added. The dataset should look as follows:&lt;UL&gt;&lt;LI&gt;ID1 01/01/2000 31/01/2000&amp;nbsp;&lt;SPAN&gt;Var_1 Var_2 Var_3&lt;/SPAN&gt;&lt;/LI&gt;&lt;LI&gt;ID1 25/02/2001&amp;nbsp;&lt;SPAN&gt;20/10/2001&amp;nbsp;Var_1 Var_2 Var_3&lt;/SPAN&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;/UL&gt;</description>
      <pubDate>Sat, 14 Oct 2017 03:29:02 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/Merging-two-databases-based-on-fuzzy-dates-similar-dates/m-p/404188#M12296</guid>
      <dc:creator>djohn051</dc:creator>
      <dc:date>2017-10-14T03:29:02Z</dc:date>
    </item>
    <item>
      <title>Re: Merging two databases based on fuzzy dates (similar dates)</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/Merging-two-databases-based-on-fuzzy-dates-similar-dates/m-p/404203#M12297</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/171133"&gt;@djohn051&lt;/a&gt;&lt;/P&gt;
&lt;P&gt;Great that you've provided sample data but please provide such sample data in the future via a data step as done below so we don't have to do this for you.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data a;
  input id $ start :ddmmyy10. end :ddmmyy10.;
  format start end date9.;
datalines;
ID1 01/01/2000 31/01/2000
ID1 25/02/2001 20/10/2001
;
run;

data b;
  input id $ start :ddmmyy10. end :ddmmyy10. (var1 var2 var3) ($);
  format start end date9.;
datalines;
ID1 03/01/2000 01/02/2000 Var_1 Var_2 Var_3
ID1 25/04/2000 28/05/2000 Var_1 Var_2 Var_3
ID1 25/07/2000 28/08/2000 Var_1 Var_2 Var_3
ID1 20/02/2001 17/10/2001 Var_1 Var_2 Var_3
;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;As for terminology: These are tables not databases. A database is the container for objects like tables so it's a level higher up.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;As for your question code as below could do:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
  create table want as
    select 
      a.*,
      b.start as r_start format=date9.,
      b.end   as r_end format=date9.,
      b.var1,
      b.var2,
      b.var3
    from 
      a left join b
        on a.id=b.id
      group by a.id, a.start
      having min(abs(a.start-b.start)+abs(a.end-b.end)) = (abs(a.start-b.start)+abs(a.end-b.end))
    ;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Please note: It is possible that more than one row from table B meets the criterion in the&amp;nbsp;&lt;EM&gt;having&lt;/EM&gt; clause so you might have to come up with additional filtering criteria to always get to a&amp;nbsp;single row match to table B.&lt;/P&gt;</description>
      <pubDate>Sun, 15 Oct 2017 00:37:16 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/Merging-two-databases-based-on-fuzzy-dates-similar-dates/m-p/404203#M12297</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2017-10-15T00:37:16Z</dc:date>
    </item>
    <item>
      <title>Re: Merging two databases based on fuzzy dates (similar dates)</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/Merging-two-databases-based-on-fuzzy-dates-similar-dates/m-p/404270#M12305</link>
      <description>&lt;P&gt;Added note:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/12447"&gt;@Patrick&lt;/a&gt;'s minimum distance criteria is equivalent to:&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt;&lt;STRONG&gt;min(sumabs(a.start-b.start, a.end-b.end)) = sumabs(a.start-b.start, a.end-b.end)&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;but it could also be:&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt;&lt;STRONG&gt;min(euclid(a.start-b.start, a.end-b.end)) = euclid(a.start-b.start, a.end-b.end)&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;both of which try to match start and end times. You could also try to match the time interval centers with:&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;min(abs(a.start + a.end - b.start - b.end)) = abs(a.start + a.end - b.start - b.end)&lt;/STRONG&gt;&lt;/P&gt;</description>
      <pubDate>Sun, 15 Oct 2017 05:09:56 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/Merging-two-databases-based-on-fuzzy-dates-similar-dates/m-p/404270#M12305</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2017-10-15T05:09:56Z</dc:date>
    </item>
    <item>
      <title>Re: Merging two databases based on fuzzy dates (similar dates)</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/Merging-two-databases-based-on-fuzzy-dates-similar-dates/m-p/405674#M12379</link>
      <description>&lt;P&gt;First, thank you for your help. For the most part, this has been working, with the exception of what you had mentioned, when two observations (with the same SN) appear in table a, but only once in table b, they will both be included in table want, but both will&amp;nbsp;correspond to the same dates and variables as the one case from table b&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;One additional question, in the following line:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE class=" language-sas"&gt;&lt;CODE class="  language-sas"&gt;      having &lt;SPAN class="token function"&gt;min&lt;/SPAN&gt;&lt;SPAN class="token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token function"&gt;abs&lt;/SPAN&gt;&lt;SPAN class="token punctuation"&gt;(&lt;/SPAN&gt;a&lt;SPAN class="token punctuation"&gt;.&lt;/SPAN&gt;start&lt;SPAN class="token operator"&gt;-&lt;/SPAN&gt;b&lt;SPAN class="token punctuation"&gt;.&lt;/SPAN&gt;start&lt;SPAN class="token punctuation"&gt;)&lt;/SPAN&gt;&lt;SPAN class="token operator"&gt;+&lt;/SPAN&gt;&lt;SPAN class="token function"&gt;abs&lt;/SPAN&gt;&lt;SPAN class="token punctuation"&gt;(&lt;/SPAN&gt;a&lt;SPAN class="token punctuation"&gt;.&lt;/SPAN&gt;end&lt;SPAN class="token operator"&gt;-&lt;/SPAN&gt;b&lt;SPAN class="token punctuation"&gt;.&lt;/SPAN&gt;end&lt;SPAN class="token punctuation"&gt;)&lt;/SPAN&gt;&lt;SPAN class="token punctuation"&gt;)&lt;/SPAN&gt; &lt;SPAN class="token operator"&gt;=&lt;/SPAN&gt; &lt;SPAN class="token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token function"&gt;abs&lt;/SPAN&gt;&lt;SPAN class="token punctuation"&gt;(&lt;/SPAN&gt;a&lt;SPAN class="token punctuation"&gt;.&lt;/SPAN&gt;start&lt;SPAN class="token operator"&gt;-&lt;/SPAN&gt;b&lt;SPAN class="token punctuation"&gt;.&lt;/SPAN&gt;start&lt;SPAN class="token punctuation"&gt;)&lt;/SPAN&gt;&lt;SPAN class="token operator"&gt;+&lt;/SPAN&gt;&lt;SPAN class="token function"&gt;abs&lt;/SPAN&gt;&lt;SPAN class="token punctuation"&gt;(&lt;/SPAN&gt;a&lt;SPAN class="token punctuation"&gt;.&lt;/SPAN&gt;end&lt;SPAN class="token operator"&gt;-&lt;/SPAN&gt;b&lt;SPAN class="token punctuation"&gt;.&lt;/SPAN&gt;end&lt;SPAN class="token punctuation"&gt;))&lt;/SPAN&gt;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;what is the purpose of the syntax after the equals (=) sign?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Many thanks!&lt;/P&gt;</description>
      <pubDate>Thu, 19 Oct 2017 17:53:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/Merging-two-databases-based-on-fuzzy-dates-similar-dates/m-p/405674#M12379</guid>
      <dc:creator>djohn051</dc:creator>
      <dc:date>2017-10-19T17:53:32Z</dc:date>
    </item>
  </channel>
</rss>

