<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Word list search in SAS Procedures</title>
    <link>https://communities.sas.com/t5/SAS-Procedures/Word-list-search/m-p/9887#M818</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;The restriction is with the number of characters that SAS allows for variable names, which cannot be longer than 32 characters. There is no way around that. Why would anyone want variable names longer than 32 chars? You could choose your own variable names by adding them to the WORDS dataset :&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;data words;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;length word $64 vname $27;&amp;nbsp; /* 27 = 32 - length("flag_") */&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;input word &amp;amp; score vname;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;word = lowcase(word);&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;datalines;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;able stuff&amp;nbsp;&amp;nbsp; 3 &lt;SPAN style="color: #800000;"&gt;able_stuff&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;best&amp;nbsp;&amp;nbsp; 4 &lt;SPAN style="color: #800000;"&gt;best&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;courage&amp;nbsp;&amp;nbsp; 5 &lt;SPAN style="color: #800000;"&gt;courage&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;proc sql;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;create table scores as&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;select phrase, word, vname, indexw(phrase, word) &amp;gt; 0 as wordPresent&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;from phrases, words&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;order by phrase, vname;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;quit;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;proc sql;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;create table scoresL as&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;select phrase, vname&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;from &lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; 
phrases inner join &lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; words on indexw(phrase, word) &amp;gt; 0&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;order by phrase, vname;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;quit;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;proc transpose data=scores out=scoresT(drop=_:) prefix=flag_;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;by phrase;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;var wordPresent;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;id vname;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;run;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;proc print data=scoresT noobs; run;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;You will get a proper dataset but will lose the exact expressions that the columns stand for. This is just one more reason to prefer the long over the wide dataset format.&lt;/P&gt;&lt;P&gt;PG&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Wed, 03 Jun 2015 17:33:16 GMT</pubDate>
    <dc:creator>PGStats</dc:creator>
    <dc:date>2015-06-03T17:33:16Z</dc:date>
    <item>
      <title>Word list search</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Word-list-search/m-p/9874#M805</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi all,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I am a SAS beginner and my issue is the following: I have a long list of words and associated values (e.g., frequency in spoken language) in one dataset (Word_List) and another, even longer dataset consisting of phrases (Phrase_List). I have to check whether each of the words from Word_List appears in each of the phrases in Phrase_List. If a word is found, the phrase is assigned the value of that word.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;So far, I have only managed to do this for single words in the Phrase_List dataset using INDEXW. However, I would need tips on how to do this automatically from one dataset to the other.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;My code for single word search:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;data ref.phrase_score;&lt;/P&gt;&lt;P&gt;set ref.phrase_list;&lt;/P&gt;&lt;P&gt;score=0;&lt;/P&gt;&lt;P&gt;found=indexw(phrase,"able");&lt;/P&gt;&lt;P&gt;if found=0 then delete;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; else score+3.56;&lt;/P&gt;&lt;P&gt;keep phrase_ID phrase found score;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I would really appreciate tips on how to do this automatically for all words from the list and using these two separate files.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Dan&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 22 Feb 2012 11:47:03 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Word-list-search/m-p/9874#M805</guid>
      <dc:creator>dio</dc:creator>
      <dc:date>2012-02-22T11:47:03Z</dc:date>
    </item>
    <item>
      <title>Word list search</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Word-list-search/m-p/9875#M806</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;It seems to me you could use a hash object. The following code is based on these assumptions: 1) word_list is a table (dataset), rather than a macro variable, character variable, or raw text. 2) The table word_list shares the same key variable "phrase" with your target table ref.phrase_list. 3) 'score' is retained and incremented by 3.56 each time there is a match.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;data ref.phrase_score;&lt;/P&gt;&lt;P&gt;if 0 then set word_list;&lt;/P&gt;&lt;P&gt;declare hash _m(dataset: "word_list");&lt;/P&gt;&lt;P&gt;_m.definekey ('phrase');&lt;/P&gt;&lt;P&gt;_m.definedata('phrase');&lt;/P&gt;&lt;P&gt;_m.definedone();&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;do until (last);&lt;/P&gt;&lt;P&gt;set ref.phrase_list end=last;&lt;/P&gt;&lt;P&gt;if _m.find()=0 then do;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp; score+3.56;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp; output;&lt;/P&gt;&lt;P&gt;end;&lt;/P&gt;&lt;P&gt;end;&lt;/P&gt;&lt;P&gt;stop;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;You may be able to do it using SQL as well, but I haven't figured out the 'score' part of it.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Regards,&lt;/P&gt;&lt;P&gt;Haikuo&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 22 Feb 2012 13:08:17 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Word-list-search/m-p/9875#M806</guid>
      <dc:creator>Haikuo</dc:creator>
      <dc:date>2012-02-22T13:08:17Z</dc:date>
    </item>
    <item>
      <title>Word list search</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Word-list-search/m-p/9876#M807</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Dan,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;A perfect chore for a no longer documented, but still existing proc:&amp;nbsp; proc spell.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Take a look at: &lt;A href="http://www.sascommunity.org/wiki/Proc_spell"&gt;http://www.sascommunity.org/wiki/Proc_spell&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;If you need it, I still have a copy of the original documentation.&amp;nbsp; Since it is there, you can easily build the dictionary of words and proc spell can do all of the heavy work.&amp;nbsp; Then, you only have to merge the resulting file with your full word list that contains the values.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 22 Feb 2012 13:16:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Word-list-search/m-p/9876#M807</guid>
      <dc:creator>art297</dc:creator>
      <dc:date>2012-02-22T13:16:45Z</dc:date>
    </item>
    <item>
      <title>Word list search</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Word-list-search/m-p/9877#M808</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;What do you want as a result (score) when a word appears more than once in the phrase? - PG&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 22 Feb 2012 15:16:52 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Word-list-search/m-p/9877#M808</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2012-02-22T15:16:52Z</dc:date>
    </item>
    <item>
      <title>Word list search</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Word-list-search/m-p/9878#M809</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;What do these two datasets look like, and what output do you want?&lt;/P&gt;&lt;P&gt;It would help to give an example that makes your question clearer.&lt;/P&gt;&lt;P&gt;I think SQL or a hash object can do it.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Ksharp&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 23 Feb 2012 02:50:57 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Word-list-search/m-p/9878#M809</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2012-02-23T02:50:57Z</dc:date>
    </item>
    <item>
      <title>Word list search</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Word-list-search/m-p/9879#M810</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks a lot for the tips! I have tried PROC SPELL so far, which seems to work quite nicely, except that it delivers the frequencies of words from the lexicon that do NOT appear in the phrase list, whereas I need the words that do appear.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Example:&lt;/P&gt;&lt;P&gt;Word file = {"able"; "best"; "courage"}. 
Every word in the list has an associated emotion rating (I am a psychologist) on a scale from 1-7: (able; 5.89); (best; 6.76); (courage; 6.30).&amp;nbsp; &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Phrase file = {"i feel at my best because i have been working on important goals for the past three days, and i've been able to complete them"; &lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; "i feel a bit disappointed that i wasn't as eager about it as everyone else but it was the best book i've read at least"; &lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; "i just hope we get to fix our relationship as best friends, it just takes a little bit of courage"}&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I want to check if each of the words is contained in each phrase, and then assign the word's score to the phrase if found. E.g., best and courage are both present in phrase 3, so the phrase_score for phrase 3 will be equal to 6.76 + 6.30 = 13.06. &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Of course, it would be nice if the phrase_score were also incremented for each repeat when a word is found more than once in the same phrase, but at the moment I am trying to get this to work for one occurrence per phrase.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Hash looks very interesting. I tried it, but I seem to be doing something wrong there, because it delivers an output file with 0 observations. 
&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;What I have done so far as a test:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;/* Count number of observations in ref.word_list and ref.phrase_list*/&lt;/P&gt;&lt;P&gt;PROC SQL;&lt;/P&gt;&lt;P&gt;SELECT COUNT(*), max(length(word)) INTO: WordNo, :Maxl FROM ref.word_list; &lt;/P&gt;&lt;P&gt;select count(*), max(length(phrase)) INTO: PhraseNo, :Maxlg FROM ref.phrase_list;&lt;/P&gt;&lt;P&gt;QUIT;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P style="text-align: left;"&gt;/* Load only 15 words from the list of words into a one-dimensional array, for testing purposes*/&lt;/P&gt;&lt;P&gt;data ref.solution;&lt;/P&gt;&lt;P&gt;set ref.phrase_list end=eoflist;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;length word1-word&amp;amp;WordNo $&amp;amp;Maxl;&lt;/P&gt;&lt;P&gt;array wordz {&amp;amp;WordNo} word1-word&amp;amp;WordNo;&lt;/P&gt;&lt;P&gt;retain word1-word&amp;amp;WordNo;&lt;/P&gt;&lt;P&gt;if _n_=1 then do iWord = 1 to 15;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; set ref.word_list;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; wordz{iWord} = word;&lt;/P&gt;&lt;P&gt;end;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;/* Look for each word in each phrase*/&lt;/P&gt;&lt;P&gt;score=0;&lt;/P&gt;&lt;P&gt;do iWord = 1 to 15;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; lag(score);&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; MatchW = indexw(phrase,wordz{iWord});&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; if MatchW &amp;gt; 0 then score+3;&lt;/P&gt;&lt;P&gt;end;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;keep phrase score;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Dan&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 23 Feb 2012 12:12:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Word-list-search/m-p/9879#M810</guid>
      <dc:creator>dio</dc:creator>
      <dc:date>2012-02-23T12:12:35Z</dc:date>
    </item>
    <item>
      <title>Word list search</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Word-list-search/m-p/9880#M811</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Then it is quite simple to do (if you don't need to count multiple occurrences):&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;data phrases;&lt;BR /&gt;length phrase $200; /* longer, if needed */&lt;BR /&gt;input;&lt;BR /&gt;phrase = trim(lowcase(_infile_));&lt;BR /&gt;datalines;&lt;BR /&gt;i feel at my best because i have been working on important goals for the past three days, and i've been able to complete them&lt;BR /&gt;i feel a bit disappointed that i wasn't as eager about it as everyone else but it was the best book i've read at least&lt;BR /&gt;i just hope we get to fix our relationship as best friends, it just takes a little bit of courage&lt;BR /&gt;;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;data words;&lt;BR /&gt;length word $16;&lt;BR /&gt;input word score;&lt;BR /&gt;word = lowcase(word);&lt;BR /&gt;datalines;&lt;BR /&gt;able 5.89&lt;BR /&gt;best 6.76&lt;BR /&gt;courage 6.30&lt;BR /&gt;;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;proc sql;&lt;BR /&gt;create table scores as&lt;BR /&gt;select phrase, sum((indexw(phrase, word)&amp;gt;0)*score) as phraseScore&lt;BR /&gt;from phrases, words&lt;BR /&gt;group by phrase;&lt;BR /&gt;quit;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;proc print data=scores; run;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Good luck.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;PG&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 23 Feb 2012 15:44:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Word-list-search/m-p/9880#M811</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2012-02-23T15:44:32Z</dc:date>
    </item>
    <item>
      <title>Word list search</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Word-list-search/m-p/9881#M812</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Works like a charm, thanks a lot!&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 23 Feb 2012 17:42:14 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Word-list-search/m-p/9881#M812</guid>
      <dc:creator>dio</dc:creator>
      <dc:date>2012-02-23T17:42:14Z</dc:date>
    </item>
    <item>
      <title>Re: Word list search</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Word-list-search/m-p/9882#M813</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I apologize if this is a very old thread. I really like the sql code to accomplish the word searches here. But now I would like to do a search on a phrase instead of just words.&lt;/P&gt;&lt;P&gt;But, as you can see I am even having problems with that as my second phrase is not reading in the data correctly.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Please let me know of any suggestions. Thank you very much in advance.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier new,courier;"&gt;data phrases;&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN style="font-family: courier new,courier;"&gt;length phrase $200; /* longer, if needed */&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN style="font-family: courier new,courier;"&gt;input;&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN style="font-family: courier new,courier;"&gt;phrase = trim(lowcase(_infile_));&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN style="font-family: courier new,courier;"&gt;datalines;&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN style="font-family: courier new,courier;"&gt;phrase one able should be 0&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN style="font-family: courier new,courier;"&gt;phrase two able stuff should be 3&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN style="font-family: courier new,courier;"&gt;phrase three best should be 4&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN style="font-family: courier new,courier;"&gt;phrase four courage should be 5&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN style="font-family: courier new,courier;"&gt;phrase five able courage should be 5&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN style="font-family: courier new,courier;"&gt;phrase six able stuff courage should be 8&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN style="font-family: courier new,courier;"&gt;;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier new,courier;"&gt;data words;&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN style="font-family: courier new,courier;"&gt;length word $16;&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN 
style="font-family: courier new,courier;"&gt;input word score;&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN style="font-family: courier new,courier;"&gt;word = lowcase(word);&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN style="font-family: courier new,courier;"&gt;datalines;&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN style="font-family: courier new,courier;"&gt;able stuff 3&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN style="font-family: courier new,courier;"&gt;best 4&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN style="font-family: courier new,courier;"&gt;courage 5&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN style="font-family: courier new,courier;"&gt;;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier new,courier;"&gt;proc sql;&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN style="font-family: courier new,courier;"&gt;create table scores as&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN style="font-family: courier new,courier;"&gt;select phrase, sum((indexw(phrase, word)&amp;gt;0)*score) as phraseScore&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN style="font-family: courier new,courier;"&gt;from phrases, words&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN style="font-family: courier new,courier;"&gt;group by phrase;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier new,courier;"&gt;proc print; run;&lt;/SPAN&gt;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 01 Jun 2015 21:41:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Word-list-search/m-p/9882#M813</guid>
      <dc:creator>Zachary</dc:creator>
      <dc:date>2015-06-01T21:41:50Z</dc:date>
    </item>
    <item>
      <title>Re: Word list search</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Word-list-search/m-p/9883#M814</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Use the &amp;amp; input modifier to read in many words and make sure you separate the score from the words by at least two spaces :&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;data words;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;length word $16;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;input word &amp;amp; score;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;word = lowcase(word);&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;datalines;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;able stuff&amp;nbsp;&amp;nbsp; 3&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;best&amp;nbsp;&amp;nbsp; 4&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;courage&amp;nbsp;&amp;nbsp; 5&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The rest stays as is.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;PG&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 02 Jun 2015 01:59:54 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Word-list-search/m-p/9883#M814</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2015-06-02T01:59:54Z</dc:date>
    </item>
    <item>
      <title>Re: Word list search</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Word-list-search/m-p/9884#M815</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P style="text-align: left;"&gt;Thank you very much. It worked, and it worked well.&lt;/P&gt;&lt;P&gt;Now let me see if I can throw a curve ball into the mix. The code you provided does an excellent job of summating the scores associated with each of the words. Now, let us think about saying that the scores no longer matter. Now when it ultimately creates a table I would like to create columns for each word or phrase that it finds:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;TABLE border="1" class="jiveBorder" style="border: 1px solid rgb(0, 0, 0); width: 100%;"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TH style="text-align: center; background-color: #6690bc; color: #ffffff; padding: 2px;" valign="middle"&gt;&lt;STRONG&gt;Phrase&lt;/STRONG&gt;&lt;/TH&gt;&lt;TH style="text-align: center; background-color: #6690bc; color: #ffffff; padding: 2px;" valign="middle"&gt;&lt;STRONG&gt;Flag_able_stuff&lt;/STRONG&gt;&lt;/TH&gt;&lt;TH style="text-align: center; background-color: #6690bc; color: #ffffff; padding: 2px;" valign="middle"&gt;&lt;STRONG&gt;Flag_best&lt;/STRONG&gt;&lt;/TH&gt;&lt;TH style="text-align: center; background-color: #6690bc; color: #ffffff; padding: 2px;" valign="middle"&gt;&lt;STRONG&gt;Flag_courage&lt;/STRONG&gt;&lt;/TH&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD style="padding: 2px;"&gt;&lt;P&gt;phrase one able&lt;/P&gt;&lt;/TD&gt;&lt;TD style="padding: 2px;"&gt;&lt;/TD&gt;&lt;TD style="padding: 2px;"&gt;&lt;/TD&gt;&lt;TD style="padding: 2px;"&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD style="padding: 2px;"&gt;phrase two able stuff&lt;/TD&gt;&lt;TD style="padding: 2px; text-align: center;"&gt;1&lt;/TD&gt;&lt;TD style="padding: 2px;"&gt;&lt;/TD&gt;&lt;TD style="padding: 2px;"&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD style="padding: 2px;"&gt;phrase three best&lt;/TD&gt;&lt;TD style="padding: 2px;"&gt;&lt;/TD&gt;&lt;TD style="padding: 2px; text-align: center;"&gt;1&lt;/TD&gt;&lt;TD style="padding: 
2px;"&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD style="padding: 2px;"&gt;phrase four courage&lt;/TD&gt;&lt;TD style="padding: 2px;"&gt;&lt;/TD&gt;&lt;TD style="padding: 2px;"&gt;&lt;/TD&gt;&lt;TD style="padding: 2px; text-align: center;"&gt;1&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD style="padding: 2px;"&gt;phrase five able courage&lt;/TD&gt;&lt;TD style="padding: 2px;"&gt;&lt;/TD&gt;&lt;TD style="padding: 2px;"&gt;&lt;/TD&gt;&lt;TD style="padding: 2px; text-align: center;"&gt;1&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD style="padding: 2px;"&gt;&lt;P&gt;phrase six able stuff courage&lt;/P&gt;&lt;/TD&gt;&lt;TD style="padding: 2px; text-align: center;"&gt;1&lt;/TD&gt;&lt;TD style="padding: 2px;"&gt;&lt;/TD&gt;&lt;TD style="padding: 2px; text-align: center;"&gt;1&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-size: 12pt; font-family: arial,helvetica,sans-serif;"&gt;The newly created column for Flag_able_stuff had to put a _ in between the words to work.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-size: 12pt; font-family: arial,helvetica,sans-serif;"&gt;Can a variation of your excellent SQL code do this? Or any other suggestions?&lt;/SPAN&gt;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 02 Jun 2015 15:05:48 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Word-list-search/m-p/9884#M815</guid>
      <dc:creator>Zachary</dc:creator>
      <dc:date>2015-06-02T15:05:48Z</dc:date>
    </item>
    <item>
      <title>Re: Word list search</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Word-list-search/m-p/9885#M816</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Will require a proc transpose step :&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;proc sql;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;create table scores as&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;select phrase, word, indexw(phrase, word) &amp;gt; 0 as wordPresent&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;from phrases, words&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;order by phrase, word;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;quit;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Proc transpose data=scores out=scoresT(drop=_:) prefix=flag_;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;by phrase;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;var wordPresent;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;id word;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;run;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;proc print data=scoresT noobs; run;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Note: I do not recommend the wide data structure of scoresT. 
A long version:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;proc sql;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;create table scoresL as&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;select phrase, word&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;from &lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; phrases inner join &lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; words on indexw(phrase, word) &amp;gt; 0&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;order by phrase, word;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;quit;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;proc print data=scoresL noobs; run;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;would be far more useful for most purposes.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;PG&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 02 Jun 2015 17:53:30 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Word-list-search/m-p/9885#M816</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2015-06-02T17:53:30Z</dc:date>
    </item>
    <item>
      <title>Re: Word list search</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Word-list-search/m-p/9886#M817</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Wow thanks. I think I am picking up that the Cartesian product is a beast in here.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Question - everything seems to be working, but I get many errors like the following:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The ID value "'flag_day qrc meet surgery ime ce'n" occurs twice in the same BY group&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I am assuming that is because my character lengths are too long. But it is sort of necessary for what I am doing.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Will there be a limit as to the number of characters I may use, or may I extend the default in SQL?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 03 Jun 2015 16:21:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Word-list-search/m-p/9886#M817</guid>
      <dc:creator>Zachary</dc:creator>
      <dc:date>2015-06-03T16:21:22Z</dc:date>
    </item>
    <item>
      <title>Re: Word list search</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Word-list-search/m-p/9887#M818</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;The restriction is with the number of characters that SAS allows for variable names, which cannot be longer than 32 characters. There is no way around that. Why would anyone want variable names longer than 32 chars? You could choose your own variable names by adding them to the WORDS dataset :&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;data words;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;length word $64 vname $27;&amp;nbsp; /* 27 = 32 - length("flag_") */&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;input word &amp;amp; score vname;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;word = lowcase(word);&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;datalines;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;able stuff&amp;nbsp;&amp;nbsp; 3 &lt;SPAN style="color: #800000;"&gt;able_stuff&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;best&amp;nbsp;&amp;nbsp; 4 &lt;SPAN style="color: #800000;"&gt;best&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;courage&amp;nbsp;&amp;nbsp; 5 &lt;SPAN style="color: #800000;"&gt;courage&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;proc sql;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;create table scores as&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;select phrase, word, vname, indexw(phrase, word) &amp;gt; 0 as wordPresent&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;from phrases, words&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;order by phrase, vname;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;quit;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;proc sql;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;create table scoresL as&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;select phrase, vname&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;from &lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; 
phrases inner join &lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; words on indexw(phrase, word) &amp;gt; 0&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;order by phrase, vname;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;quit;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Proc transpose data=scores out=scoresT(drop=_:) prefix=flag_;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;by phrase;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;var wordPresent;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;id vname;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;run;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;proc print noobs; run;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;You will get a proper dataset but will lose the exact expressions related to the columns. This is just one more reason to prefer the long over the wide dataset format.&lt;/P&gt;&lt;P&gt;PG&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 03 Jun 2015 17:33:16 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Word-list-search/m-p/9887#M818</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2015-06-03T17:33:16Z</dc:date>
    </item>
    <item>
      <title>Re: Word list search</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Word-list-search/m-p/9888#M819</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Thank you again PGStats. Once again I have been humbled by what I am trying to accomplish. I have one more idea brewing here.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;But first, a little background.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I have over 1,000,000 comments used by our insurance company. The goal with part of our predictive modeling is to see if any key words or phrases are predictive of future reserve changes and other indices.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;We began the process by doing a Text Analysis within Enterprise Miner. From what I have seen this is a very useful tool in itself, creating clusters and/or factors that academically would be very useful.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I am trying to take the words and phrases that come out as statistically significant and run further analyses to see if they are statistically meaningful. My goal was to collect hundreds, even thousands, of these key words and phrases, then let bivariate indices such as correlations be our "stepping stone" for figuring out whether they would be worthy of inclusion in a decision tree model. Ideally I would like to find anywhere from a few dozen to over one hundred words &amp;amp; phrases that could potentially predict reserve changes. Of course these would not be the main predictors.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Any suggestions are welcome. Right now I have about half a dozen different ideas brewing on how I wish to proceed. But no worries about the 1 million records. I have already pulled a random sample of 5,000 for working purposes.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 03 Jun 2015 17:56:19 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Word-list-search/m-p/9888#M819</guid>
      <dc:creator>Zachary</dc:creator>
      <dc:date>2015-06-03T17:56:19Z</dc:date>
    </item>
    <item>
      <title>Re: Word list search</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Word-list-search/m-p/9889#M820</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Please post your new idea as a new discussion...&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;PG&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 03 Jun 2015 18:51:15 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Word-list-search/m-p/9889#M820</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2015-06-03T18:51:15Z</dc:date>
    </item>
    <item>
      <title>Re: Word list search</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Word-list-search/m-p/9890#M821</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I am very close to a final solution. I think I can take care of it with a lot of labels:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;proc sql;&lt;BR /&gt;&amp;nbsp; create table scoresL as&lt;BR /&gt;&amp;nbsp; select CLAIMNO, COMMENTTEXT, Text4Matching, NumberForLabels&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; from &lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; SRS_Comments500 inner join &lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; TextFromExcel on indexw(COMMENTTEXT, Text4Matching)&amp;gt;0&lt;BR /&gt;&amp;nbsp; order by CLAIMNO, COMMENTTEXT, Text4Matching;&lt;BR /&gt;quit;&lt;BR /&gt;proc print data=scoresL noobs; run;&lt;/P&gt;&lt;P&gt;proc transpose data=scoresL out=FlaggedCommentsT(drop=_:) prefix=flag_;&lt;BR /&gt;&amp;nbsp; by CLAIMNO;&lt;BR /&gt;&amp;nbsp; id NumberForLabels;&lt;BR /&gt;run;&lt;BR /&gt;proc print data=FlaggedCommentsT noobs; run;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I just need to create a routine to label each of the numbered fields.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;But I am getting lots of errors like this one:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The ID value, "flag_6" occurs twice in the same BY group.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I cannot figure out what is happening here. Any suggestions?&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 03 Jun 2015 20:12:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Word-list-search/m-p/9890#M821</guid>
      <dc:creator>Zachary</dc:creator>
      <dc:date>2015-06-03T20:12:35Z</dc:date>
    </item>
    <item>
      <title>Re: Word list search</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Word-list-search/m-p/9891#M822</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Sorry - did not see your response there...&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 03 Jun 2015 20:23:15 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Word-list-search/m-p/9891#M822</guid>
      <dc:creator>Zachary</dc:creator>
      <dc:date>2015-06-03T20:23:15Z</dc:date>
    </item>
    <item>
      <title>Re: Word list search</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Word-list-search/m-p/9892#M823</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Have you tried to use the tables that&amp;nbsp; are created by the Text Mining Nodes? Those create exactly the same tables (usually in long format, not transposed).&lt;/P&gt;&lt;P&gt;Also, if you have ideas, I would suggest posting into:&amp;nbsp; "&lt;A _jive_internal="true" class="js-target-container" data-objectid="2022" data-objecttype="14" href="https://communities.sas.com/choose-container.jspa?contentType=1&amp;amp;containerType=14&amp;amp;container=2022&amp;amp;upload=false" style="font-size: 13.6499996185303px; font-family: 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif; color: #0e66ba;"&gt;Text and Content Analytics&lt;/A&gt;" forum. (I wonder what text miners think about measuring pairwise correlations and selecting the best ~100 variables for predictive modeling.)&lt;/P&gt;&lt;P&gt;If you have technical questions (like this one), this forum is adequate.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 03 Jun 2015 20:33:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Word-list-search/m-p/9892#M823</guid>
      <dc:creator>gergely_batho</dc:creator>
      <dc:date>2015-06-03T20:33:12Z</dc:date>
    </item>
    <item>
      <title>Re: Word list search</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Word-list-search/m-p/9893#M824</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Thank you Gergely. Yes, I have tried using the tables created by the Text Mining Nodes.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The unfortunate part about the Text and Content Analytics forum is that there is not usually a lot of traffic in there.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;But I promise to keep my questions within this forum on the technical side. Thank you so much!&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 03 Jun 2015 20:46:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Word-list-search/m-p/9893#M824</guid>
      <dc:creator>Zachary</dc:creator>
      <dc:date>2015-06-03T20:46:32Z</dc:date>
    </item>
  </channel>
</rss>

