<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: measure distance between two words in a text string in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/measure-distance-between-two-words-in-a-text-string/m-p/266951#M52713</link>
    <description>&lt;P&gt;Hi there,&lt;/P&gt;
&lt;P&gt;First of all, thank you for your kind reply. Using your code I have successfully measured the distance between two specific words in number of words.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
   xyz='She was prescribed exercise and drug. You may visit next week to take further advice about medicine as well as diet';
   First = 'medicine' ; /* word expected to come first */
   Second = 'diet' ;    /* word expected to come second */
   Firstword=.;
   Secondword=.;
   worddistance=.;
   do i = 1 to (countw(xyz));
      if missing(Firstword) and upcase(First) = upcase(Scan(xyz,i)) then FirstWord=i;
      if missing(Secondword) and upcase(Second) = upcase(Scan(xyz,i)) then Secondword=i;
   end;
   /* a missing value sorts below any number, so Firstword is tested explicitly */
   if not missing(Firstword) and Firstword lt Secondword;
   worddistance= SecondWord-Firstword;
   put worddistance=;
   put Secondword = Firstword=;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Thank you also for raising some questions that are very relevant to my analysis. To start the discussion, I tried to keep the problem as simple as possible.&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;The words are case insensitive.&lt;/LI&gt;
&lt;LI&gt;If the first word comes after the second word, the record can be filtered from flagging/analysis by the subsetting IF on Firstword lt Secondword.&lt;/LI&gt;
&lt;LI&gt;If only one of the two words is present, the record is filtered from flagging/analysis as well, which is the desired behaviour.&lt;/LI&gt;
&lt;LI&gt;The remaining issue is how to calculate the distance when a word occurs multiple times, for example:&lt;/LI&gt;
&lt;/UL&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;xyz='She was prescribed exercise and diet. You may visit next week to take further advice about medicine as well as diet';&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The above code does not work for this case. The word 'diet' occurs twice. The code measures the distance to the first "diet" and not to the second "diet". In addition, the condition that the second word must come after the first word then excludes this record.&lt;/P&gt;
&lt;P&gt;Once again, thank you in advance for your kind guidance.&lt;/P&gt;
&lt;P&gt;Regards,&lt;BR /&gt;Deepak&lt;/P&gt;</description>
    <pubDate>Thu, 28 Apr 2016 14:07:00 GMT</pubDate>
    <dc:creator>DeepakSwain</dc:creator>
    <dc:date>2016-04-28T14:07:00Z</dc:date>
    <item>
      <title>measure distance between two words in a text string</title>
      <link>https://communities.sas.com/t5/SAS-Programming/measure-distance-between-two-words-in-a-text-string/m-p/266854#M52686</link>
      <description>&lt;P&gt;I am interested in measuring the distance between two specific words in a text string in terms of the &lt;FONT color="#ff0000"&gt;number of words&lt;/FONT&gt; between them.&lt;/P&gt;
&lt;P&gt;Most of the functions I am aware of work in terms of the &lt;FONT color="#ff0000"&gt;number of characters&lt;/FONT&gt;, such as &lt;FONT color="#ff0000"&gt;indexw&lt;/FONT&gt; and &lt;FONT color="#ff0000"&gt;findw&lt;/FONT&gt;:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
   searchhere='residential treatment facility';
   fullword=indexw(searchhere,'treatment');
   put fullword=;
run;

data _null_;
   xyz='She sells seashells? Yes, she does.'; *search for the word she;
   whereisShe=findw(xyz,'she');
   put whereisShe;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;N.B.: I am looking for the distance, i.e. the number of words, between &lt;FONT color="#ff0000"&gt;'sick'&lt;/FONT&gt; and &lt;FONT color="#ff0000"&gt;'antibiotics'&lt;/FONT&gt; in the string: &lt;FONT color="#ff0000"&gt;Very sick people may only take antibiotics.&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;Thank you in advance for your kind reply.&lt;BR /&gt;Deepak&lt;/P&gt;</description>
      <pubDate>Wed, 27 Apr 2016 20:35:43 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/measure-distance-between-two-words-in-a-text-string/m-p/266854#M52686</guid>
      <dc:creator>DeepakSwain</dc:creator>
      <dc:date>2016-04-27T20:35:43Z</dc:date>
    </item>
    <item>
      <title>Re: measure distance between two words in a text string</title>
      <link>https://communities.sas.com/t5/SAS-Programming/measure-distance-between-two-words-in-a-text-string/m-p/266862#M52689</link>
      <description>&lt;P&gt;What if one of the words is repeated? Which count would you want?&lt;/P&gt;
&lt;P&gt;What if both words appear in the string multiple times?&lt;/P&gt;
&lt;P&gt;What if the "first" word actually occurs after the "second" word?&lt;/P&gt;
&lt;P&gt;Is the search to be case sensitive? Should "Sick" match "sick"? (I assume yes, but this should be clarified.)&lt;/P&gt;
&lt;P&gt;What happens when only one of the words matches?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You may also have to look at the delimiters between words. Does a dash in a compound word qualify? Would sick-bed count as "sick"?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
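&lt;P&gt;As a small illustration of the delimiter question (a sketch only, separate from the stub that follows): as far as I know, the default delimiter list of SCAN and COUNTW on ASCII systems includes the hyphen, so "sick-bed" would split into two words unless you pass your own delimiter list. The variable names w_default and w_blank are made up for this example.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
   /* default delimiters: on ASCII systems the hyphen is expected to split the compound word */
   w_default = scan('sick-bed', 1);
   /* a blank as the only delimiter keeps the compound word whole */
   w_blank = scan('sick-bed', 1, ' ');
   put w_default= w_blank=;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;If compound words should not match, the same third argument can be passed to COUNTW and SCAN inside the search loop so that both functions split the text the same way.&lt;/P&gt;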
&lt;P&gt;Here is a stub of code that may work. It matches the FIRST occurrence of each word and matches regardless of case.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_; 

xyz='She sells seashells? Yes, she does.'; *search for the word she;
First = 'She'  ;
Second = 'does' ;
Firstword=.; 
Secondword=.;
do i = 1 to (countw(xyz));
   if missing(Firstword) and upcase(First) = upcase(Scan(xyz,i)) then FirstWord=i;
   if missing(Secondword) and upcase(Second) = upcase(Scan(xyz,i)) then Secondword=i;
end;

put Firstword= SecondWord=;

run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 27 Apr 2016 20:54:51 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/measure-distance-between-two-words-in-a-text-string/m-p/266862#M52689</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2016-04-27T20:54:51Z</dc:date>
    </item>
    <item>
      <title>Re: measure distance between two words in a text string</title>
      <link>https://communities.sas.com/t5/SAS-Programming/measure-distance-between-two-words-in-a-text-string/m-p/266951#M52713</link>
      <description>&lt;P&gt;Hi there,&lt;/P&gt;
&lt;P&gt;First of all, thank you for your kind reply. Using your code I have successfully measured the distance between two specific words in number of words.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
   xyz='She was prescribed exercise and drug. You may visit next week to take further advice about medicine as well as diet';
   First = 'medicine' ; /* word expected to come first */
   Second = 'diet' ;    /* word expected to come second */
   Firstword=.;
   Secondword=.;
   worddistance=.;
   do i = 1 to (countw(xyz));
      if missing(Firstword) and upcase(First) = upcase(Scan(xyz,i)) then FirstWord=i;
      if missing(Secondword) and upcase(Second) = upcase(Scan(xyz,i)) then Secondword=i;
   end;
   /* a missing value sorts below any number, so Firstword is tested explicitly */
   if not missing(Firstword) and Firstword lt Secondword;
   worddistance= SecondWord-Firstword;
   put worddistance=;
   put Secondword = Firstword=;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Thank you also for raising some questions that are very relevant to my analysis. To start the discussion, I tried to keep the problem as simple as possible.&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;The words are case insensitive.&lt;/LI&gt;
&lt;LI&gt;If the first word comes after the second word, the record can be filtered from flagging/analysis by the subsetting IF on Firstword lt Secondword.&lt;/LI&gt;
&lt;LI&gt;If only one of the two words is present, the record is filtered from flagging/analysis as well, which is the desired behaviour.&lt;/LI&gt;
&lt;LI&gt;The remaining issue is how to calculate the distance when a word occurs multiple times, for example:&lt;/LI&gt;
&lt;/UL&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;xyz='She was prescribed exercise and diet. You may visit next week to take further advice about medicine as well as diet';&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The above code does not work for this case. The word 'diet' occurs twice. The code measures the distance to the first "diet" and not to the second "diet". In addition, the condition that the second word must come after the first word then excludes this record.&lt;/P&gt;
&lt;P&gt;Once again, thank you in advance for your kind guidance.&lt;/P&gt;
&lt;P&gt;Regards,&lt;BR /&gt;Deepak&lt;/P&gt;</description>
      <pubDate>Thu, 28 Apr 2016 14:07:00 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/measure-distance-between-two-words-in-a-text-string/m-p/266951#M52713</guid>
      <dc:creator>DeepakSwain</dc:creator>
      <dc:date>2016-04-28T14:07:00Z</dc:date>
    </item>
    <item>
      <title>Re: measure distance between two words in a text string</title>
      <link>https://communities.sas.com/t5/SAS-Programming/measure-distance-between-two-words-in-a-text-string/m-p/267029#M52718</link>
      <description>&lt;P&gt;You could extend the logic to find words that occur multiple times, but you'll still need to make some assumptions and decisions.&lt;/P&gt;
&lt;P&gt;For instance, you can find out how many times each word occurs and use an array to store the positions of its first, second, etc. occurrences.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;This demonstrates getting those values.&lt;/P&gt;
&lt;P&gt;You will need to decide your logic on getting which comparisons of the positions you want.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_; 

   xyz='She sells seashells? Yes, she does.'; *search for the word she;
   First = 'She'  ;
   Second = 'does' ;
   array firsts (4)  f1-f4; /* assumes that neither word occurs more than 4 times */
   Array seconds (4) s1-s4; 
   Findex=1; /* these index variables point to the next free slot in each array */
   Sindex=1;
   do i = 1 to (countw(xyz));
      /* the dim() checks ignore occurrences beyond the array size instead of failing with a subscript error */
      if upcase(First) = upcase(Scan(xyz,i)) and Findex le dim(Firsts) then do;
         Firsts[Findex] = i;
         Findex = Findex+1;
      end;
      if upcase(Second) = upcase(Scan(xyz,i)) and Sindex le dim(Seconds) then do;
         Seconds[Sindex]=i;
         Sindex = Sindex +1;
      end;
   end;

   do i = 1 to (n(of Firsts(*)));
      put First "occurs in position" +1 Firsts[i] +(-1) '.' @;
      do j = 1 to (n(of seconds(*)));
         put +1 second "occurs in position" +1 seconds[j];
      end;
      put;
   end;
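
   /* Sketch of one possible comparison rule (an assumption, not a rule given
      in this thread): for each occurrence of the first word, report the
      distance to the nearest LATER occurrence of the second word.
      Nearest and worddistance are illustrative variable names. */
   do i = 1 to (n(of Firsts(*)));
      Nearest = .;
      do j = 1 to (n(of Seconds(*)));
         /* positions were stored in ascending order, so the first later match is the nearest one */
         if missing(Nearest) and Seconds[j] gt Firsts[i] then Nearest = Seconds[j];
      end;
      /* occurrences of the first word with no later second word are skipped */
      if not missing(Nearest) then do;
         worddistance = Nearest - Firsts[i];
         put First "at position" +1 Firsts[i] "is" +1 worddistance "word(s) before" +1 Second;
      end;
   end;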

run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 28 Apr 2016 16:22:29 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/measure-distance-between-two-words-in-a-text-string/m-p/267029#M52718</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2016-04-28T16:22:29Z</dc:date>
    </item>
  </channel>
</rss>

