<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to perform n-gram analysis using Enterprise miner ? in SAS Data Science</title>
    <link>https://communities.sas.com/t5/SAS-Data-Science/How-to-perform-n-gram-analysis-using-Enterprise-miner/m-p/360412#M9793</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;You can compute n-gram similarity with PROC FCMP. It's been a while since I've used EM, so I can't remember whether PROC FCMP is available in it. The sample below implements a simple n-gram matching function.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc fcmp outlib=work.dq.func;
     function ngram(string1 $,string2 $,len);
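     /* 'kan' keeps only letters, digits and underscores; then uppercase */
     s1 = upcase(compress(string1,,'kan'));
     s2 = upcase(compress(string2,,'kan'));
     score=0;
     /* Count n-grams of s1 found in s2; the last window starts at length-len+1 */
     do index = 1 to (length(s1)-len+1);
           if find(s2,substr(s1,index,len)) then score+1;
     end;
     /* Same count in the other direction */
     do index = 1 to (length(s2)-len+1);
           if find(s1,substr(s2,index,len)) then score+1;
     end;
     score = score/2;
     /* Normalise by the n-gram count of the longer string */
     score_pct = score / (max(length(s1)-len+1,length(s2)-len+1));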
     return(score_pct);
     endsub;
run;
 
options cmplib=work.dq;
data tests;
     length s1 s2 $50;
     infile datalines dsd dlm='|';
     input s1 $ s2 $;
     ngram = ngram(s1,s2,2);
cards;
Acme Inc.| Acme Integrated Technologies
Acme | Acme Inc.
Acme | Acme
Smith,John| John Smith
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
    <pubDate>Mon, 22 May 2017 12:19:44 GMT</pubDate>
    <dc:creator>foobarbaz</dc:creator>
    <dc:date>2017-05-22T12:19:44Z</dc:date>
    <item>
      <title>How to perform n-gram analysis using Enterprise miner ?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/How-to-perform-n-gram-analysis-using-Enterprise-miner/m-p/356696#M9792</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;Can anyone please help me? I want to perform n-gram analysis using SAS EM.&lt;/P&gt;&lt;P&gt;I have a few datasets, a couple of which contain important text, and I have applied Text Mining nodes to them, such as Text Parsing and Text Filter.&lt;/P&gt;&lt;P&gt;I would like to know whether n-gram analysis is part of a particular node, or whether there is another way I should do it.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Kind regards&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 07 May 2017 10:30:47 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/How-to-perform-n-gram-analysis-using-Enterprise-miner/m-p/356696#M9792</guid>
      <dc:creator>geniusgenie</dc:creator>
      <dc:date>2017-05-07T10:30:47Z</dc:date>
    </item>
    <item>
      <title>Re: How to perform n-gram analysis using Enterprise miner ?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/How-to-perform-n-gram-analysis-using-Enterprise-miner/m-p/360412#M9793</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;You can compute n-gram similarity with PROC FCMP. It's been a while since I've used EM, so I can't remember whether PROC FCMP is available in it. The sample below implements a simple n-gram matching function.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc fcmp outlib=work.dq.func;
     function ngram(string1 $,string2 $,len);
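     /* 'kan' keeps only letters, digits and underscores; then uppercase */
     s1 = upcase(compress(string1,,'kan'));
     s2 = upcase(compress(string2,,'kan'));
     score=0;
     /* Count n-grams of s1 found in s2; the last window starts at length-len+1 */
     do index = 1 to (length(s1)-len+1);
           if find(s2,substr(s1,index,len)) then score+1;
     end;
     /* Same count in the other direction */
     do index = 1 to (length(s2)-len+1);
           if find(s1,substr(s2,index,len)) then score+1;
     end;
     score = score/2;
     /* Normalise by the n-gram count of the longer string */
     score_pct = score / (max(length(s1)-len+1,length(s2)-len+1));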
     return(score_pct);
     endsub;
run;
 
options cmplib=work.dq;
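 
/* Added sanity check (not part of the original post), worked by hand
   and assuming the function above compiles as posted.
   'Acme' vs 'Acme Inc.' is normalised to ACME vs ACMEINC.
   ACME has the bigrams AC, CM, ME, and all 3 occur in ACMEINC.
   ACMEINC has AC, CM, ME, EI, IN, NC, and 3 of those occur in ACME.
   So score = (3+3)/2 = 3 and score_pct = 3/max(3,6) = 0.5 */
data _null_;
     check = ngram('Acme','Acme Inc.',2);
     put check=;
run;
 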
data tests;
     length s1 s2 $50;
     infile datalines dsd dlm='|';
     input s1 $ s2 $;
     ngram = ngram(s1,s2,2);
cards;
Acme Inc.| Acme Integrated Technologies
Acme | Acme Inc.
Acme | Acme
Smith,John| John Smith
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Mon, 22 May 2017 12:19:44 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/How-to-perform-n-gram-analysis-using-Enterprise-miner/m-p/360412#M9793</guid>
      <dc:creator>foobarbaz</dc:creator>
      <dc:date>2017-05-22T12:19:44Z</dc:date>
    </item>
    <item>
      <title>Re: How to perform n-gram analysis using Enterprise miner ?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/How-to-perform-n-gram-analysis-using-Enterprise-miner/m-p/362410#M9794</link>
      <description>Hi Foobarbaz, thanks for your reply. Could you please tell me how I can run this code in EM? Do I need to connect my Data Partition or File Import nodes to it?</description>
      <pubDate>Mon, 29 May 2017 02:57:21 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/How-to-perform-n-gram-analysis-using-Enterprise-miner/m-p/362410#M9794</guid>
      <dc:creator>geniusgenie</dc:creator>
      <dc:date>2017-05-29T02:57:21Z</dc:date>
    </item>
  </channel>
</rss>

