<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Help in Jaro-Winkler/Edit distance Similarity in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Help-in-Jaro-Wrinkler-Edit-distance-Similarity/m-p/671661#M201723</link>
    <description>&lt;P&gt;Unfortunately, there are no exact equivalents in SAS unless you have SAS Text Miner or SAS Viya. The closest functions to an edit distance are COMPGED and SPEDIS. Keep in mind that the main difference among edit-distance functions is the weight given to each operation. If you have SAS Viya, you can use Python and the py_stringmatching package (see AnHai Doan, Alon Halevy, Zachary Ives, “Principles of Data Integration”, Morgan Kaufmann, 2012, Chapter 4 “String Matching”, available on the package’s homepage). You can also develop these functions yourself in SAS if you have the time and patience to write your own algorithm.&lt;/P&gt;</description>
    <pubDate>Thu, 23 Jul 2020 04:13:43 GMT</pubDate>
    <dc:creator>smantha</dc:creator>
    <dc:date>2020-07-23T04:13:43Z</dc:date>
    <item>
      <title>Help in Jaro-Winkler/Edit distance Similarity</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Help-in-Jaro-Wrinkler-Edit-distance-Similarity/m-p/671660#M201722</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I was tasked with converting SAS programs to Hive queries in EAP. Since I am new to Hive, I am not sure what the Hive equivalent is for the function calls below, which appear in the SAS code.&lt;/P&gt;&lt;P&gt;utl_match.jaro_winkler_similarity (REPLACE (UPPER(name1), ' ', ''), REPLACE (UPPER(name2), ' ', '')) AS j_score,&lt;/P&gt;&lt;P&gt;utl_match.edit_distance_similarity (REPLACE (UPPER(name1), ' ', ''), REPLACE (UPPER(name2), ' ', '')) AS e_d_score&lt;/P&gt;&lt;P&gt;I would appreciate it if someone could help me with the equivalent code in Hive.&lt;/P&gt;&lt;P&gt;Thanks for checking.&lt;/P&gt;</description>
      <pubDate>Thu, 23 Jul 2020 03:48:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Help-in-Jaro-Wrinkler-Edit-distance-Similarity/m-p/671660#M201722</guid>
      <dc:creator>Kalai2008</dc:creator>
      <dc:date>2020-07-23T03:48:13Z</dc:date>
    </item>
    <item>
      <title>Re: Help in Jaro-Winkler/Edit distance Similarity</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Help-in-Jaro-Wrinkler-Edit-distance-Similarity/m-p/671661#M201723</link>
      <description>&lt;P&gt;Unfortunately, there are no exact equivalents in SAS unless you have SAS Text Miner or SAS Viya. The closest functions to an edit distance are COMPGED and SPEDIS. Keep in mind that the main difference among edit-distance functions is the weight given to each operation. If you have SAS Viya, you can use Python and the py_stringmatching package (see AnHai Doan, Alon Halevy, Zachary Ives, “Principles of Data Integration”, Morgan Kaufmann, 2012, Chapter 4 “String Matching”, available on the package’s homepage). You can also develop these functions yourself in SAS if you have the time and patience to write your own algorithm.&lt;/P&gt;</description>
      <pubDate>Thu, 23 Jul 2020 04:13:43 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Help-in-Jaro-Wrinkler-Edit-distance-Similarity/m-p/671661#M201723</guid>
      <dc:creator>smantha</dc:creator>
      <dc:date>2020-07-23T04:13:43Z</dc:date>
    </item>
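The reply's point that edit-distance functions differ mainly in the weight given to each operation can be illustrated with a small pure-Python sketch. This is not the implementation behind utl_match, COMPGED, or SPEDIS. The function names, the default costs, and the 0 to 100 scaling used here are assumptions chosen for illustration, and the rounding may not match what any particular database returns.

```python
def weighted_edit_distance(a, b, insert_cost=1, delete_cost=1, substitute_cost=1):
    """Dynamic-programming edit distance with configurable operation costs."""
    # Row 0: turning the empty string into b[:j] takes j insertions.
    prev = [j * insert_cost for j in range(len(b) + 1)]
    for i, ch_a in enumerate(a, start=1):
        # Column 0: turning a[:i] into the empty string takes i deletions.
        curr = [i * delete_cost]
        for j, ch_b in enumerate(b, start=1):
            same = ch_a == ch_b
            curr.append(min(
                prev[j] + delete_cost,                           # drop ch_a
                curr[j - 1] + insert_cost,                       # add ch_b
                prev[j - 1] + (0 if same else substitute_cost),  # keep or swap
            ))
        prev = curr
    return prev[-1]


def edit_distance_similarity(a, b):
    """Scale the unit-cost distance to 0..100 (assumed convention: 100 means identical)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100
    return round(100 * (1 - weighted_edit_distance(a, b) / longest))


def normalize(name):
    """Mirror REPLACE(UPPER(name), ' ', '') from the question."""
    return name.upper().replace(" ", "")


if __name__ == "__main__":
    # Unit costs give the classic Levenshtein distance.
    print(weighted_edit_distance("kitten", "sitting"))                      # 3
    # Making substitutions more expensive changes the result.
    print(weighted_edit_distance("kitten", "sitting", substitute_cost=2))   # 5
    print(edit_distance_similarity(normalize("john smith"), normalize("JOHN SMITH")))  # 100
```

Changing the three cost parameters is what separates one edit-distance variant from another, so any port to Hive should be checked against a handful of known input pairs from the original system before its scores are trusted.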
  </channel>
</rss>

