<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: indexing based on a different dataset? in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/indexing-based-on-a-different-dataset/m-p/441991#M110559</link>
    <description>&lt;P&gt;It’s a fuzzy join/merge, which is data intensive because you essentially have to compare every record in one file against every record in the other.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You can look at some of the options for fuzzy lookups, such as&lt;/P&gt;
&lt;P&gt;compged, soundex, complev.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you have another field you can join on as well, such as birth date, age, or facility, that can significantly reduce the number of comparisons.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Maybe the answer here from friedegg would be helpful.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;A href="https://communities.sas.com/t5/SAS-Procedures/Name-matching/td-p/82780" target="_blank"&gt;https://communities.sas.com/t5/SAS-Procedures/Name-matching/td-p/82780&lt;/A&gt;&lt;/P&gt;</description>
    <pubDate>Sat, 03 Mar 2018 01:32:13 GMT</pubDate>
    <dc:creator>Reeza</dc:creator>
    <dc:date>2018-03-03T01:32:13Z</dc:date>
    <item>
      <title>indexing based on a different dataset?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/indexing-based-on-a-different-dataset/m-p/441987#M110557</link>
      <description>&lt;P&gt;I have 2 datasets: main and mds.&lt;/P&gt;&lt;P&gt;The mds dataset has pcp_names in 1 variable, and the pcp first_name, last_name, and middle_initial as 3 separate variables.&lt;/P&gt;&lt;P&gt;The main dataset also has a pcp variable, but not necessarily in the same format as the pcp_names variable in the mds dataset (it may have MD attached, may not have a middle initial, etc.).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I want to flag records in my main dataset that have a pcp in the mds dataset. In other words, if some combination of a pcp first_name and last_name from the mds dataset appears in the string variable pcp in the main dataset, then keep that record or flag it as 1. Any help on how to do this with 2 separate datasets would be really appreciated.&lt;/P&gt;</description>
      <pubDate>Sat, 03 Mar 2018 01:14:43 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/indexing-based-on-a-different-dataset/m-p/441987#M110557</guid>
      <dc:creator>Melk</dc:creator>
      <dc:date>2018-03-03T01:14:43Z</dc:date>
    </item>
    <item>
      <title>Re: indexing based on a different dataset?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/indexing-based-on-a-different-dataset/m-p/441991#M110559</link>
      <description>&lt;P&gt;It’s a fuzzy join/merge, which is data intensive because you essentially have to compare every record in one file against every record in the other.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You can look at some of the options for fuzzy lookups, such as&lt;/P&gt;
&lt;P&gt;compged, soundex, complev.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you have another field you can join on as well, such as birth date, age, or facility, that can significantly reduce the number of comparisons.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
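&lt;P&gt;As a rough starting point for the simplest case, a PROC SQL sketch might look like the one below. It assumes the dataset and variable names from the question (pcp in main; first_name and last_name in mds); main_flagged and pcp_flag are hypothetical names. It only catches exact, case-insensitive substring hits, so short names can produce false positives and misspellings would still need something like compged or complev with a cutoff chosen for your data.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;proc sql;
  /* One output row per row of MAIN; pcp_flag is 1 when some MDS row has   */
  /* both its first and last name appearing somewhere in the PCP string.   */
  create table main_flagged as
  select m.*,
         case when exists (
                select 1
                from mds as d
                where not missing(d.first_name)
                  and not missing(d.last_name)
                  /* 'i' ignores case, 't' trims trailing blanks */
                  and find(m.pcp, d.first_name, 'it')
                  and find(m.pcp, d.last_name, 'it')
              )
              then 1 else 0
         end as pcp_flag
  from main as m;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;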
&lt;P&gt;Maybe the answer here from friedegg would be helpful.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;A href="https://communities.sas.com/t5/SAS-Procedures/Name-matching/td-p/82780" target="_blank"&gt;https://communities.sas.com/t5/SAS-Procedures/Name-matching/td-p/82780&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Sat, 03 Mar 2018 01:32:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/indexing-based-on-a-different-dataset/m-p/441991#M110559</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2018-03-03T01:32:13Z</dc:date>
    </item>
    <item>
      <title>Re: indexing based on a different dataset?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/indexing-based-on-a-different-dataset/m-p/442002#M110564</link>
      <description>&lt;P&gt;Thank you. I am looking into those functions. I think it essentially needs to be something like an indexc embedded in a hash, but I am not sure if I can even do that.&lt;/P&gt;</description>
      <pubDate>Sat, 03 Mar 2018 02:27:24 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/indexing-based-on-a-different-dataset/m-p/442002#M110564</guid>
      <dc:creator>Melk</dc:creator>
      <dc:date>2018-03-03T02:27:24Z</dc:date>
    </item>
  </channel>
</rss>

