<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Speling Korrecter in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Speling-Korrecter/m-p/33172#M6440</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;My current approach uses compged, because I am only looking for words within a maximum edit distance of 2 (per the article).&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Tue, 25 Oct 2011 15:48:46 GMT</pubDate>
    <dc:creator>FriedEgg</dc:creator>
    <dc:date>2011-10-25T15:48:46Z</dc:date>
    <item>
      <title>Speling Korrecter</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Speling-Korrecter/m-p/33164#M6432</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I read a very interesting article today: "How to Write a Spelling Corrector" by Peter Norvig. In it he outlines the process behind Google's 'Did you mean:' spelling suggestions.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Here is a link to the article: &lt;/SPAN&gt;&lt;A class="jive-link-external-small" href="http://norvig.com/spell-correct.html"&gt;http://norvig.com/spell-correct.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Proc spell can identify misspellings, and can even suggest replacement words if you use the suggest option, but as far as I know it cannot replace the identified text with a new word. So my suggestion is this: using the information and algorithms outlined in Peter Norvig's article, create a SAS version of his correct function, ideally with proc fcmp, or some other process that suggests a proper spelling for a given string, or returns the original string if it is already a proper word.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 21 Oct 2011 21:06:24 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Speling-Korrecter/m-p/33164#M6432</guid>
      <dc:creator>FriedEgg</dc:creator>
      <dc:date>2011-10-21T21:06:24Z</dc:date>
    </item>
    <item>
      <title>Speling Korrecter</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Speling-Korrecter/m-p/33165#M6433</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P style="font-size: 16px; margin-bottom: 17px;"&gt;&lt;SPAN style="color: #c00;"&gt;Did you mean:&lt;/SPAN&gt; &lt;A href="http://www.google.com/search?hl=en&amp;amp;q=Spelling%20Corrector&amp;amp;spell=1&amp;amp;sa=X"&gt;&lt;STRONG&gt;&lt;EM&gt;Spelling Corrector&lt;/EM&gt;&lt;/STRONG&gt;&lt;/A&gt;&lt;/P&gt;&lt;P&gt; :smileydevil:&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 21 Oct 2011 21:55:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Speling-Korrecter/m-p/33165#M6433</guid>
      <dc:creator>data_null__</dc:creator>
      <dc:date>2011-10-21T21:55:53Z</dc:date>
    </item>
    <item>
      <title>Speling Korrecter</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Speling-Korrecter/m-p/33166#M6434</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Data Null: Did you mean: Punctuation Korrector&lt;STRONG&gt;?&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;FriedEgg:&amp;nbsp; In spite of the fact that Proc Spell is no longer documented (although I do have a copy of the original documentation), I think it is an excellent suggestion and might just be enough to get SAS to re-document proc Spell.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 21 Oct 2011 22:07:04 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Speling-Korrecter/m-p/33166#M6434</guid>
      <dc:creator>art297</dc:creator>
      <dc:date>2011-10-21T22:07:04Z</dc:date>
    </item>
    <item>
      <title>Speling Korrecter</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Speling-Korrecter/m-p/33167#M6435</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I am very interested in this.&lt;/P&gt;&lt;P&gt;&lt;A href="http://en.wikipedia.org/wiki/Bayes%27_theorem"&gt;Bayes' Theorem&lt;/A&gt; is a very old statistical theory.&lt;/P&gt;&lt;P&gt;Some statisticians approve of it, while others oppose it.&lt;/P&gt;&lt;P&gt;Fortunately, SAS already offers the spedis function, which measures the spelling distance between two words.&lt;/P&gt;&lt;P&gt;So with the help of a hash table, I think this will be easy to achieve in SAS.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Ksharp&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 24 Oct 2011 09:55:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Speling-Korrecter/m-p/33167#M6435</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2011-10-24T09:55:53Z</dc:date>
    </item>
    <item>
      <title>Speling Korrecter</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Speling-Korrecter/m-p/33168#M6436</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Of course, two limitations of proc spell would have to be overcome: (1) the documentation doesn't provide any clue (at least that I could find) about how to access the dictionary, and (2) it doesn't provide a way (at least that I could find) of capturing its output.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 24 Oct 2011 13:20:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Speling-Korrecter/m-p/33168#M6436</guid>
      <dc:creator>art297</dc:creator>
      <dc:date>2011-10-24T13:20:50Z</dc:date>
    </item>
    <item>
      <title>Speling Korrecter</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Speling-Korrecter/m-p/33169#M6437</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Yes, the elusive sashelp.base.master.dictnary. I cannot figure out how to use this catalog entry either... It may be stored in a way similar to stored formats.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 24 Oct 2011 15:52:16 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Speling-Korrecter/m-p/33169#M6437</guid>
      <dc:creator>FriedEgg</dc:creator>
      <dc:date>2011-10-24T15:52:16Z</dc:date>
    </item>
    <item>
      <title>Speling Korrecter</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Speling-Korrecter/m-p/33170#M6438</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I am frustrated that spedis() does not seem to work for this situation.&lt;/P&gt;&lt;P&gt;Does anyone intend to try implementing Peter's algorithm?&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 25 Oct 2011 08:26:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Speling-Korrecter/m-p/33170#M6438</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2011-10-25T08:26:32Z</dc:date>
    </item>
    <item>
      <title>Speling Korrecter</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Speling-Korrecter/m-p/33171#M6439</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Why do you think it wouldn't work? I would think that, given a string that isn't an exact match for any of the entries in a dictionary, using spedis, complev and/or compged to compare that string with all of the strings in the dictionary would have an excellent chance of solving the problem.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 25 Oct 2011 13:37:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Speling-Korrecter/m-p/33171#M6439</guid>
      <dc:creator>art297</dc:creator>
      <dc:date>2011-10-25T13:37:45Z</dc:date>
    </item>
    <item>
      <title>Speling Korrecter</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Speling-Korrecter/m-p/33172#M6440</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;My current approach uses compged, because I am only looking for words within a maximum edit distance of 2 (per the article).&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 25 Oct 2011 15:48:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Speling-Korrecter/m-p/33172#M6440</guid>
      <dc:creator>FriedEgg</dc:creator>
      <dc:date>2011-10-25T15:48:46Z</dc:date>
    </item>
    <item>
      <title>Re: Speling Korrecter</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Speling-Korrecter/m-p/33173#M6441</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I found that complev() works much better. The dictionary is the one offered by FriedEgg.&lt;/P&gt;&lt;P&gt;But I am still interested in Peter's algorithm - &lt;A class="jive-link-external-small" href="http://en.wikipedia.org/wiki/Bayes%27_theorem"&gt;Bayes' Theorem&lt;/A&gt;.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;PRE&gt;filename x 'c:\unix-words';
data dictionary;
 infile x truncover;
 input word : $20.;
run;

%let input_word=Korrecter;
data _null_;
 * define the host variable WORD without reading any rows;
 if 0 then set dictionary;
 * load the dictionary into a hash table and attach an iterator;
 declare hash ha(hashexp:20,dataset : 'work.dictionary');
 declare hiter hi('ha');
  ha.definekey('word');
  ha.definedata('word');
  ha.definedone();
 * d holds the smallest Levenshtein distance seen so far;
 d=10000;
 do while(hi.next()=0);
  dd=complev("&amp;amp;input_word",word,'i');
  * keep the first word found at a strictly smaller distance;
  if d gt dd then do;
   d=dd;
   want_word=word;
  end;
 end;
 put 'WORD: ' "&amp;amp;input_word" +2 'Did you mean: ' want_word;
 stop;
run;
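
/* Added sketch, not part of the original post: it prints the three distance
   functions discussed in this thread side by side for one pair of words,
   so that their scales can be compared. The word pair is an arbitrary
   example chosen for illustration. */
data _null_;
 length a b $20;
 a='Korrecter';
 b='Corrector';
 lev=complev(a,b,'i');  /* Levenshtein edit distance, ignoring case */
 ged=compged(a,b,'i');  /* generalized edit distance, ignoring case */
 spd=spedis(a,b);       /* asymmetric spelling distance */
 put a= b= lev= ged= spd=;
run;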
 
 
&lt;/PRE&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Ksharp&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 26 Oct 2011 03:42:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Speling-Korrecter/m-p/33173#M6441</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2011-10-26T03:42:53Z</dc:date>
    </item>
    <item>
      <title>Re: Speling Korrecter</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Speling-Korrecter/m-p/33174#M6442</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Here is what I have at this point. It is not exactly what Peter Norvig does, but it is where I am so far.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;PRE __default_attr="plain" __jive_macro_name="code" class="jive_text_macro jive_macro_code"&gt;&lt;PRE&gt;* Based on : &lt;A href="http://norvig.com/spell-correct.html" target="_blank"&gt;http://norvig.com/spell-correct.html&lt;/A&gt;;
 
%let word=speling;
 
filename big '/temp/big.txt'; * &lt;A href="http://norvig.com/big.txt" target="_blank"&gt;http://norvig.com/big.txt&lt;/A&gt;;
filename words '/usr/share/dict/words'; * unix default dictionary (provided in my other word puzzle related posts);
 
data words;
 infile words truncover;
 input word $upcase48.;
run;
 
data big;
 length word $48;
 infile big lrecl=1024 truncover;
 input @;
 * replace every non-letter with a blank, then collapse runs of blanks;
 _infile_=compbl(prxchange('s/[^A-Z]/ /i',-1,_infile_));
 if _infile_ ne '' then
  do i=1 to countw(_infile_,' ');
   word=upcase(scan(_infile_,i,' '));
   if word ne '' then output;
  end;
 drop i;
run;
 
data words;
 set words big;
 word=strip(word);
run;
 
proc freq data=words;
 tables word /list out=wfreq(drop=percent) noprint;
run;
 
%macro wf_find; *to avoid repeating this code block below for each correction type;
 if wf.find()=0 then
  do;
   clev=complev(orig_word,word);
   if clev&amp;lt;=2 then output;
  end;
%mend;
 
data corrections;
 length word a b c $48;
 orig_word=upcase("&amp;amp;word");
 alphabet='ABCDEFGHIJKLMNOPQRSTUVWXYZ';
 
 if 0 then set wfreq;
  declare hash wf(hashexp:10,dataset:'wfreq');
  declare hiter wfi('wf');
   wf.definekey('word');
   wf.definedata(all:'Y');
   wf.definedone();
 
 *replaces;
 do i=1 to length(orig_word);
  do ii=1 to 26;
   word=orig_word;
   substr(word,i,1)=substr(alphabet,ii,1);
   %wf_find
  end;
 end;
 *deletes;
 do i=1 to length(orig_word);
  word=orig_word;
  substr(word,i,1)='';
  word=compress(word);
  %wf_find
 end;
 *transposes;
 do i=1 to length(orig_word)-1;
  word=orig_word;
  a=substr(word,i,1);
  b=substr(word,i+1,1);
  substr(word,i,1)=b;
  substr(word,i+1,1)=a;
  %wf_find
 end;
 *inserts;
 do i=0 to length(orig_word);
  word=orig_word;
  a=subpad(word,1,i);
  b=subpad(word,i+1,length(word)-i);
  do ii=1 to 26;
   c=substr(alphabet,ii,1);
   word=cats(of a c b);
   %wf_find
  end;
 end;
 *brute - scan every word in the dictionary and keep those with an edit distance of &amp;lt;= 2. The candidate loops above only generate single-edit variants, so this scan is what catches the remaining words at distance 2. It also repeats the hits already found above, which is why the final query uses distinct;
 do while(wfi.next()=0);
  clev=complev(orig_word,word);
  if clev&amp;lt;=2 then output;
 end;
 keep orig_word word count clev;
 stop;
run;
 
proc sql;
 select distinct 'Did you mean: ' || strip(word)
   from corrections
  where clev=( select min(clev)
                 from corrections )
    and count=( select max(count)
                  from corrections
                 where clev=( select min(clev)
                                from corrections ));
quit;
&lt;/PRE&gt;
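&lt;P&gt;A minimal sketch, not part of the original post, of another way to pick the winner: sort the candidates by edit distance and then by descending frequency, and read the first row. The data set corrections_demo and its rows are hypothetical, made up for illustration; breaking ties alphabetically is an assumption of this sketch, not something taken from Norvig's article. With these made-up rows, SPELLING ranks first.&lt;/P&gt;
&lt;PRE&gt;* hypothetical candidate rows, invented for this sketch;
data corrections_demo;
 length word $48;
 input word $ count clev;
 datalines;
SPELLING 10 1
SPEWING 2 1
SPILLING 5 2
;
run;

* best candidate first - smallest distance, then highest count, then alphabetical;
proc sort data=corrections_demo out=ranked;
 by clev descending count word;
run;

data _null_;
 set ranked(obs=1);
 put 'Did you mean: ' word;
run;
&lt;/PRE&gt;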
&lt;P&gt;&lt;/P&gt;
&lt;/PRE&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 28 Oct 2011 17:50:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Speling-Korrecter/m-p/33174#M6442</guid>
      <dc:creator>FriedEgg</dc:creator>
      <dc:date>2011-10-28T17:50:41Z</dc:date>
    </item>
    <item>
      <title>Re: Speling Korrecter</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Speling-Korrecter/m-p/33175#M6443</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi, &lt;A _jive_internal="true" class="font-color-meta-light localScroll" href="https://communities.sas.com/108305#107822" title="Go to message"&gt;FriedEgg&lt;/A&gt;.&lt;/P&gt;&lt;P&gt;Peter's Bayes method is simple if you can use SAS's complev().&lt;/P&gt;&lt;P&gt;Using a hash table makes it very easy and needs less code.&lt;/P&gt;&lt;P&gt;Of course, I can code it based on Peter's Bayes approach. It is easy.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;But what I am concerned with is how to write a function like complev() myself to measure the edit distance.&lt;/P&gt;&lt;P&gt;That is the point, because in the end Peter did not use the probabilities from the Bayes formula.&lt;/P&gt;&lt;P&gt;Bayes' theory is simple, but the calculation is very complicated.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Ksharp&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 31 Oct 2011 02:55:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Speling-Korrecter/m-p/33175#M6443</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2011-10-31T02:55:09Z</dc:date>
    </item>
    <item>
      <title>Speling Korrecter</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Speling-Korrecter/m-p/33176#M6444</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Norvig's article does depart slightly from truly calculating the Bayesian probability. Instead it implements a sort of logical substitute: combine the likelihood of the correction (approximated by the shortest edit distance) with the frequency of the corrected word in our corpus (big.txt). The best candidate is the word with the shortest edit distance and, among those, the highest frequency. This is definitely not calculating the probability, but it follows the logic of what the formula accomplishes, or so Peter argues. He also goes over a wide range of problems this approach has in properly identifying corrections. In a different article, whose source I can no longer remember, I read that Google uses over 10 trillion 4-word strings in its dictionary to help identify spelling corrections (because the surrounding words aid in the correction). Here is an example of the issue with this method.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I mean to type 'THEY' but accidentally type 'THAY'.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;PRE&gt;%let word=thay;

filename big '/nas/sasbox/users/mkastin/big.txt';

data big;
 length word $48;
 infile big lrecl=1024 truncover;
 input @;
 _infile_=compbl(prxchange('s/[^A-Z]/ /i',-1,_infile_));
 if not missing(_infile_) then
  do i=1 to countw(_infile_,' ');
   word=upcase(scan(_infile_,i,' '));
   if word ne '' then output;
  end;
 drop i;
run;

proc freq data=big;
 tables word /list out=wfreq(drop=percent) noprint;
run;

data corrections;
 if 0 then set wfreq;

 declare hash wf(hashexp:10,dataset:'wfreq');
 declare hiter wfi('wf');
  wf.definekey('word');
  wf.definedata(all:'Y');
  wf.definedone();

 orig_word=upcase("&amp;amp;word");

 do while(wfi.next()=0);
  clev=complev(orig_word,word);
  if clev&amp;lt;=2 then output;
 end;
 keep orig_word word count clev;
 stop;
run;

proc sql noprint;
 select min(clev) into :min_clev from corrections;
 select max(count) into :max_count from corrections where clev=&amp;amp;min_clev;
quit;

proc sql;
 select distinct 'Did you mean: ' || strip(word)
   from corrections
  where clev=&amp;amp;min_clev and count=&amp;amp;max_count;
quit;
&lt;/PRE&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Did you mean: THAT&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;No, I meant 'THEY'... However, look at the data (here are my choices with the shortest edit distance, 1):&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;PRE&gt;WORD  COUNT  orig_word  clev
HAY      42  THAY          1
THAW      2  THAY          1
THA       1  THAY          1
THAT  12423  THAY          1
THAN   1199  THAY          1
TRAY      8  THAY          1
THY      47  THAY          1
THEY   3932  THAY          1
&lt;/PRE&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 01 Nov 2011 20:09:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Speling-Korrecter/m-p/33176#M6444</guid>
      <dc:creator>FriedEgg</dc:creator>
      <dc:date>2011-11-01T20:09:38Z</dc:date>
    </item>
    <item>
      <title>Speling Korrecter</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Speling-Korrecter/m-p/33177#M6445</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi.&lt;/P&gt;&lt;P&gt;I know the Bayes formula very well.&lt;/P&gt;&lt;P&gt;As I said before, I agree with it sometimes but disagree with it at other times, like other statisticians.&lt;/P&gt;&lt;P&gt;In this case the Bayes calculation is simple because the variable's distribution is discrete, not continuous.&lt;/P&gt;&lt;P&gt;If it were continuous, you would need multiple integrals to get the probability. That is horrible.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;As far as I know, Bayes' theory is highly regarded in the IT field.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Ksharp&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 02 Nov 2011 08:09:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Speling-Korrecter/m-p/33177#M6445</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2011-11-02T08:09:05Z</dc:date>
    </item>
    <item>
      <title>Speling Korrecter</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Speling-Korrecter/m-p/33178#M6446</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I happened to be looking at the seminars for the next SAS Global Forum, and guess what one of the pre-conference statistical tutorials is... Introduction to Bayesian Analysis Using SAS Software. Humorous coincidence.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 03 Nov 2011 20:12:34 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Speling-Korrecter/m-p/33178#M6446</guid>
      <dc:creator>FriedEgg</dc:creator>
      <dc:date>2011-11-03T20:12:34Z</dc:date>
    </item>
    <item>
      <title>Re: Speling Korrecter</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Speling-Korrecter/m-p/33179#M6447</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;That is good news.&lt;/P&gt;&lt;P&gt;But I cannot go. If you can, it is a good opportunity to learn some Bayes theory.&lt;/P&gt;&lt;P&gt;BTW, I have submitted several papers to SAS Global Forum 2012. I don't know whether they will be accepted.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Now I want to say something about your example above.&lt;/P&gt;&lt;P&gt;The word 'they' is a very high-frequency word, so the probability of mistyping it is very low. That is to say, when someone types 'thay', what they want may be 'thaw'.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The words people mistype are usually unusual or novel, so the word they actually want may be one with a low probability, not a high one.&lt;/P&gt;&lt;P&gt;Like:&lt;/P&gt;&lt;P&gt;&lt;EM&gt;economtric =&amp;gt; econometric&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;vs.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;economtric =&amp;gt; economic&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;So that is the reason I sometimes agree with it and sometimes disagree.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 04 Nov 2011 05:09:06 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Speling-Korrecter/m-p/33179#M6447</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2011-11-04T05:09:06Z</dc:date>
    </item>
  </channel>
</rss>

