<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic How to fuzzy match two variables from two datasets? in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/How-to-fuzzy-match-two-variables-from-two-datasets/m-p/531202#M145349</link>
    <description>&lt;P&gt;Hello all,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have two datasets.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;DATA HAVE1;
    INPUT NAME1 $20.;  /* formatted input: reads 20 columns so names with spaces are kept whole */
    CARDS;
ERIC Stewart
Eri John
ERI Abe
Eric Mars
Eris
Eric MARSTIN
;
run;

DATA HAVE2;
    INPUT NAME2 $20.;  /* formatted input: reads 20 columns so names with spaces are kept whole */
    CARDS;
Eric Stewart
Eri Johnny
Eri Lee
Eric Swift
Eric
Eric Strong
;
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;I want to fuzzy match NAME1 with NAME2. If a NAME2 value is similar to some NAME1 value, keep that NAME2; otherwise, remove it.&lt;/P&gt;&lt;P&gt;1. The names don't need to match exactly. A close match is fine.&lt;/P&gt;&lt;P&gt;2. Case should not matter. Uppercase and lowercase versions of the same letter count as a match.&lt;/P&gt;&lt;P&gt;Here is what I want:&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;DATA WANT;
    INPUT NAME2 $20.;  /* formatted input: reads 20 columns so names with spaces are kept whole */
    CARDS;
Eric Stewart
Eri Johnny
;
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Can this be done in SQL? I have several other variables to group by and control for.&lt;/P&gt;&lt;P&gt;I tried LIKE with CATS, but the code seems to be wrong: it keeps processing and never finishes.&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
select
a.*, b.*
from have1 a,
have2 b
where a.name1 LIKE cats('%',b.name2,'%');
quit;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Thank you very much.&lt;/P&gt;&lt;P&gt;Stay warm.&lt;/P&gt;</description>
    <pubDate>Tue, 29 Jan 2019 22:52:55 GMT</pubDate>
    <dc:creator>yanshuai</dc:creator>
    <dc:date>2019-01-29T22:52:55Z</dc:date>
    <item>
      <title>How to fuzzy match two variables from two datasets?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-fuzzy-match-two-variables-from-two-datasets/m-p/531202#M145349</link>
      <description>&lt;P&gt;Hello all,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have two datasets.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;DATA HAVE1;
    INPUT NAME1 $20.;  /* formatted input: reads 20 columns so names with spaces are kept whole */
    CARDS;
ERIC Stewart
Eri John
ERI Abe
Eric Mars
Eris
Eric MARSTIN
;
run;

DATA HAVE2;
    INPUT NAME2 $20.;  /* formatted input: reads 20 columns so names with spaces are kept whole */
    CARDS;
Eric Stewart
Eri Johnny
Eri Lee
Eric Swift
Eric
Eric Strong
;
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;I want to fuzzy match NAME1 with NAME2. If a NAME2 value is similar to some NAME1 value, keep that NAME2; otherwise, remove it.&lt;/P&gt;&lt;P&gt;1. The names don't need to match exactly. A close match is fine.&lt;/P&gt;&lt;P&gt;2. Case should not matter. Uppercase and lowercase versions of the same letter count as a match.&lt;/P&gt;&lt;P&gt;Here is what I want:&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;DATA WANT;
    INPUT NAME2 $20.;  /* formatted input: reads 20 columns so names with spaces are kept whole */
    CARDS;
Eric Stewart
Eri Johnny
;
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Can this be done in SQL? I have several other variables to group by and control for.&lt;/P&gt;&lt;P&gt;I tried LIKE with CATS, but the code seems to be wrong: it keeps processing and never finishes.&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
select
a.*, b.*
from have1 a,
have2 b
where a.name1 LIKE cats('%',b.name2,'%');
quit;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Thank you very much.&lt;/P&gt;&lt;P&gt;Stay warm.&lt;/P&gt;</description>
      <pubDate>Tue, 29 Jan 2019 22:52:55 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-fuzzy-match-two-variables-from-two-datasets/m-p/531202#M145349</guid>
      <dc:creator>yanshuai</dc:creator>
      <dc:date>2019-01-29T22:52:55Z</dc:date>
    </item>
    <item>
      <title>Re: How to fuzzy match two variables from two datasets?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-fuzzy-match-two-variables-from-two-datasets/m-p/531210#M145353</link>
      <description>&lt;P&gt;You could use a spelling distance function such as COMPLEV:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
select *
from have2
where exists(select * from have1 where complev(name1, have2.name2, 4) &amp;lt; 4);  
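/* Hedged variant, not part of the original reply: the optional fourth
   COMPLEV argument takes modifiers, and 'i' ignores case, which speaks to
   the case-insensitivity requirement in the question. The cutoff of 4 is
   carried over from the query above, not tuned. */
select *
from have2
where exists(select * from have1 where complev(name1, have2.name2, 4, 'i') &lt; 4);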
quit;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 29 Jan 2019 23:38:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-fuzzy-match-two-variables-from-two-datasets/m-p/531210#M145353</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2019-01-29T23:38:38Z</dc:date>
    </item>
    <item>
      <title>Re: How to fuzzy match two variables from two datasets?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-fuzzy-match-two-variables-from-two-datasets/m-p/532461#M145897</link>
      <description>&lt;P&gt;Thank you very much.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I used your code and also looked through previous posts with similar questions.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The final code I came up with is:&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
select *
from have1, have2
where compged(have1.name1, have2.name2, 2, 'INL') &amp;lt; 2;
quit;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;I am using COMPGED, and it gives me fairly good results. It also runs very quickly, even though it still involves a Cartesian product.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you very much.&lt;/P&gt;&lt;P&gt;I hope this is helpful to others too.&lt;/P&gt;</description>
      <pubDate>Sun, 03 Feb 2019 22:39:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-fuzzy-match-two-variables-from-two-datasets/m-p/532461#M145897</guid>
      <dc:creator>yanshuai</dc:creator>
      <dc:date>2019-02-03T22:39:12Z</dc:date>
    </item>
  </channel>
</rss>

