<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Joining similar data and arbitrarily forcing it to be 1:1 with no duplicates in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Joining-similar-data-and-arbitrarily-forcing-it-to-be-1-1-with/m-p/779566#M248305</link>
    <description>&lt;P&gt;I am not quite sure what you want to do with the "waterfall", but is it prioritizing the matches by the time interval?&lt;/P&gt;
&lt;P&gt;Then the solution might be to order by the date difference as well:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;                                       
  create table join as select                   
    table_1.key,table_2.crit as crit_2,count(*) as N,
    abs(table_1.date-table_2.date) as date_diff     
  from table_1 join table_2                
  on table_1.amount=table_2.amount and          
     table_1.amount_num=table_2.amount_num and  
     abs(table_1.date-table_2.date) between 0 and 12
  group by table_2.crit                          
  ;                                             
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Then use that for sorting as well:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sort;             
  by crit_2 date_diff n;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;And do the rest as in my original post.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Hope this is something like what you want!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Wed, 10 Nov 2021 14:30:53 GMT</pubDate>
    <dc:creator>s_lassen</dc:creator>
    <dc:date>2021-11-10T14:30:53Z</dc:date>
    <item>
      <title>Joining similar data and arbitrarily forcing it to be 1:1 with no duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Joining-similar-data-and-arbitrarily-forcing-it-to-be-1-1-with/m-p/778345#M247751</link>
      <description>&lt;P&gt;DATA WORK.TABLE_1;&lt;BR /&gt;INPUT KEY AMOUNT AMOUNT_NUM $ DATE :DDMMYY10.;&lt;BR /&gt;FORMAT DATE DDMMYY10.;&lt;BR /&gt;CARDS;&lt;BR /&gt;1 10 000123 01/10/2021&lt;BR /&gt;2 10 000123 01/10/2021&lt;BR /&gt;3 10 000123 01/10/2021&lt;BR /&gt;;&lt;BR /&gt;RUN;&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;
&lt;P&gt;DATA WORK.TABLE_2;&lt;BR /&gt;INPUT CRIT AMOUNT AMOUNT_NUM $ DATE :DDMMYY10.;&lt;BR /&gt;FORMAT DATE DDMMYY10.;&lt;BR /&gt;CARDS;&lt;BR /&gt;1001 10 000123 01/10/2021&lt;BR /&gt;1002 10 000123 01/10/2021&lt;BR /&gt;1003 10 000123 01/10/2021&lt;BR /&gt;;&lt;BR /&gt;RUN;&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt;DATA WORK.HASH_EXAMPLE;&lt;BR /&gt;SET WORK.TABLE_1;&lt;BR /&gt;CRIT=.;&lt;BR /&gt;IF _N_ = 1&lt;BR /&gt;THEN DO;&lt;BR /&gt;DECLARE HASH B (DATASET:"WORK.TABLE_2 (RENAME=(DATE=B_DATE))");&lt;BR /&gt;B.DEFINEKEY("AMOUNT_NUM","AMOUNT","B_DATE");&lt;BR /&gt;B.DEFINEDATA("CRIT");&lt;BR /&gt;B.DEFINEDONE();&lt;/P&gt;
&lt;P&gt;END;&lt;BR /&gt;B_DATE = DATE;&lt;BR /&gt;RC = B.FIND();&lt;/P&gt;
&lt;P&gt;DO WHILE (RC NE 0 AND B_DATE &amp;lt; DATE + 12);&lt;BR /&gt;B_DATE + 1; &lt;BR /&gt;RC = B.FIND();&lt;BR /&gt;END;&lt;BR /&gt;IF RC=0&lt;BR /&gt;THEN DO;&lt;BR /&gt;RC=B.REMOVE();&lt;BR /&gt;END;&lt;BR /&gt;DROP RC B_DATE;&lt;BR /&gt;RUN;&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;
&lt;P&gt;DATA WORK.TABLE_3;&lt;BR /&gt;INPUT CRIT AMOUNT AMOUNT_NUM $ DATE :DDMMYY10. CRIT;&lt;BR /&gt;FORMAT DATE DDMMYY10.;&lt;BR /&gt;CARDS;&lt;BR /&gt;1 10 000123 01/10/2021 1001&lt;BR /&gt;2 10 000123 01/10/2021 1002&lt;BR /&gt;3 10 000123 01/10/2021 1003&lt;BR /&gt;;&lt;BR /&gt;RUN;&lt;BR /&gt;&lt;BR /&gt;So with the above program, I am starting with Table_1 as my source set of data, Table_2 is the table I am trying to left join into.&amp;nbsp; The hash example is what I thought was the best way to accomplish the end output I wanted, which is table 3.&amp;nbsp;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Wants:&amp;nbsp; To take Table_1, and based on 3 criteria (one of them being a date that I want to join with a 12 day range), to get a matching "Crit".&amp;nbsp; Anytime a Crit is matched to a given Key, I no longer want to evaluate the matched Key or Crit values.&amp;nbsp; In theory, I would be fine if Key 1 matched with Crit 1003, per the example, as long as Keys 2 and 3 matched with Crits 1002 and 1001 in someway.&amp;nbsp; Any thoughts on the best way of accomplishing this?&lt;/P&gt;</description>
      <pubDate>Wed, 03 Nov 2021 21:39:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Joining-similar-data-and-arbitrarily-forcing-it-to-be-1-1-with/m-p/778345#M247751</guid>
      <dc:creator>lawatkey</dc:creator>
      <dc:date>2021-11-03T21:39:58Z</dc:date>
    </item>
    <item>
      <title>Re: Joining similar data and arbitrarily forcing it to be 1:1 with no duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Joining-similar-data-and-arbitrarily-forcing-it-to-be-1-1-with/m-p/778688#M247902</link>
      <description>&lt;P&gt;Note: I edited this reply so show the complete solution instead of just showing a suggested change (my second post).&lt;/P&gt;
&lt;P&gt;You data are not very representative, as all records match. But here is a solution that may work:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;First, create a table of all possible joins:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;                                       
  create table join as select                   
    table_1.key,table_2.crit as crit_2,count(*) as N      
  from table_1 join table_2                
  on table_1.amount=table_2.amount and          
     table_1.amount_num=table_2.amount_num and  
     abs(table_1.date-table_2.date) between 0 and 12
  group by table_2.crit                          
  ;                                             
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Then sort by CRIT_2 and number of KEYs&amp;nbsp; for that CRIT:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sort;             
  by crit_2 n;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Create the output table, with an index:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want(index=(key));
  set table_1;         
  crit=.;              
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Finally, update the output with CRIT values from the join:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want;                                                        
  found=0;                                                        
  do until(last.crit_2);                                          
    set join;                                                     
    by crit_2;                                                    
    modify want key=key/unique; /* Changed: use /UNIQUE to be able to read the same key twice */                                          
    if found then continue; /* this CRIT has already been used */ 
    if crit=. then do; /* This KEY has not been matched yet */    
      crit=crit_2;                                                
      replace;                                                    
      found=1;  /* This CRIT has now been used */                                                  
      end;                                                        
    end;                                                          
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;This solution is not guaranteed to get the best match, but because we use the joins with fewest possible matches first, it is a good approximation.&lt;/P&gt;</description>
      <pubDate>Mon, 08 Nov 2021 09:42:16 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Joining-similar-data-and-arbitrarily-forcing-it-to-be-1-1-with/m-p/778688#M247902</guid>
      <dc:creator>s_lassen</dc:creator>
      <dc:date>2021-11-08T09:42:16Z</dc:date>
    </item>
    <item>
      <title>Re: Joining similar data and arbitrarily forcing it to be 1:1 with no duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Joining-similar-data-and-arbitrarily-forcing-it-to-be-1-1-with/m-p/778729#M247916</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/76464"&gt;@s_lassen&lt;/a&gt; I like the program you posted, thank you!&amp;nbsp; Unfortunately when applying to my actual datasets, I get "ERROR: No matching observation was found in MASTER data set."&amp;nbsp; Not sure how it's possible based on the initial join criteria for table WANT.&amp;nbsp; I even tried to make table WANT from Table_1 where the key was filtered based solely on the keys that matched in the JOIN table, no dice. I imagine the issue is if a Crit is matched to a Key and then another Key is evaluated where its only match was that last Crit that was matched, then it would have no remaining matches left.&amp;nbsp; Is there a way to modify the IF statement in the last step to just leave the crit as blank or a . if that scenario happens?&lt;/P&gt;</description>
      <pubDate>Fri, 05 Nov 2021 12:39:34 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Joining-similar-data-and-arbitrarily-forcing-it-to-be-1-1-with/m-p/778729#M247916</guid>
      <dc:creator>lawatkey</dc:creator>
      <dc:date>2021-11-05T12:39:34Z</dc:date>
    </item>
    <item>
      <title>Re: Joining similar data and arbitrarily forcing it to be 1:1 with no duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Joining-similar-data-and-arbitrarily-forcing-it-to-be-1-1-with/m-p/779075#M248094</link>
      <description>&lt;P&gt;I think I know what the problem is: you are looking for the same key twice, which may happen with real life data.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I think you should change&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;modify want key=key;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;to&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;modify want key=key/unique;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;That may help. I edited my first reply to show the complete solution after the change.&lt;/P&gt;</description>
      <pubDate>Mon, 08 Nov 2021 09:43:31 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Joining-similar-data-and-arbitrarily-forcing-it-to-be-1-1-with/m-p/779075#M248094</guid>
      <dc:creator>s_lassen</dc:creator>
      <dc:date>2021-11-08T09:43:31Z</dc:date>
    </item>
    <item>
      <title>Re: Joining similar data and arbitrarily forcing it to be 1:1 with no duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Joining-similar-data-and-arbitrarily-forcing-it-to-be-1-1-with/m-p/779556#M248297</link>
      <description>&lt;P&gt;Thank you very much! The code works quite well, that's amazing &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&amp;nbsp; I might try to make it into a macro to start with a 0 day range to "prioritize" closer matches then waterfall down and expand the date range by 1 on each end per loop, unless you know of a simpler way, like an order by statement in the join, but if not what you've provided is still great, I really appreciate it.&lt;/P&gt;</description>
      <pubDate>Wed, 10 Nov 2021 14:00:08 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Joining-similar-data-and-arbitrarily-forcing-it-to-be-1-1-with/m-p/779556#M248297</guid>
      <dc:creator>lawatkey</dc:creator>
      <dc:date>2021-11-10T14:00:08Z</dc:date>
    </item>
    <item>
      <title>Re: Joining similar data and arbitrarily forcing it to be 1:1 with no duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Joining-similar-data-and-arbitrarily-forcing-it-to-be-1-1-with/m-p/779566#M248305</link>
      <description>&lt;P&gt;I am not quite sure what you want to do with the "waterfall", but is it prioritizing the matches by the time interval?&lt;/P&gt;
&lt;P&gt;Then the solution might be to order by the date difference as well:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;                                       
  create table join as select                   
    table_1.key,table_2.crit as crit_2,count(*) as N,
    abs(table_1.date-table_2.date) as date_diff     
  from table_1 join table_2                
  on table_1.amount=table_2.amount and          
     table_1.amount_num=table_2.amount_num and  
     abs(table_1.date-table_2.date) between 0 and 12
  group by table_2.crit                          
  ;                                             
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Then use that for sorting as well:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sort;             
  by crit_2 date_diff n;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;And do the rest as in my original post.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Hope this is something like what you want!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 10 Nov 2021 14:30:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Joining-similar-data-and-arbitrarily-forcing-it-to-be-1-1-with/m-p/779566#M248305</guid>
      <dc:creator>s_lassen</dc:creator>
      <dc:date>2021-11-10T14:30:53Z</dc:date>
    </item>
    <item>
      <title>Re: Joining similar data and arbitrarily forcing it to be 1:1 with no duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Joining-similar-data-and-arbitrarily-forcing-it-to-be-1-1-with/m-p/779570#M248308</link>
      <description>&lt;P&gt;I didn't read your whole thread but have you checked out Propensity Score matching algorithms?&amp;nbsp; SAS even has a PSMATCH procedure now.&lt;/P&gt;</description>
      <pubDate>Wed, 10 Nov 2021 14:34:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Joining-similar-data-and-arbitrarily-forcing-it-to-be-1-1-with/m-p/779570#M248308</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2021-11-10T14:34:05Z</dc:date>
    </item>
  </channel>
</rss>

