<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: web crawler for finding multiple instances of the same keyword in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/web-crawler-for-finding-multiple-instances-of-the-same-keyword/m-p/169831#M32597</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;@sonikm24&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Some time ago it was important to highlight issues relevant to Y2K compliance. To show the context of each issue, my code buffered program lines in blocks whose size was controlled by a macro variable (I started with 3, but the client needed 5). The code used ARRAYs to buffer the lines of code. You might have a similar concern: there are multiple strings to target, and these must be allowed to overlap.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The code was not concise.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Best of luck with your challenge.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;peterC&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Mon, 07 Apr 2014 14:37:05 GMT</pubDate>
    <dc:creator>Peter_C</dc:creator>
    <dc:date>2014-04-07T14:37:05Z</dc:date>
    <item>
      <title>web crawler for finding multiple instances of the same keyword</title>
      <link>https://communities.sas.com/t5/SAS-Programming/web-crawler-for-finding-multiple-instances-of-the-same-keyword/m-p/169829#M32595</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;I am using a web crawler to find multiple instances of a keyword in SEC files (the attached code searches for "Notional"). I am using the CALL PRXNEXT routine to do the job, but I am having a problem outputting the lines surrounding each keyword. I want to control how many lines are output around every instance of the keyword in the SEC file. For example, if there are 5 instances of "Notional" in the file, I want to output the lines surrounding each of them. In the code, I am using the following lines for that purpose:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; if (0 &amp;lt; countC2 &amp;lt;= 10) then do;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; output;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; end;&lt;/P&gt;&lt;P&gt;But changing 10 to 15 or 5 does not increase or decrease the number of output lines surrounding the keywords. Can anyone help with this issue? I have attached the code and a sample Excel file.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks.&lt;/P&gt;&lt;P&gt;Sonik Mandal&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Sat, 05 Apr 2014 02:55:06 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/web-crawler-for-finding-multiple-instances-of-the-same-keyword/m-p/169829#M32595</guid>
      <dc:creator>sonikm24</dc:creator>
      <dc:date>2014-04-05T02:55:06Z</dc:date>
    </item>
    <item>
      <title>Re: web crawler for finding multiple instances of the same keyword</title>
      <link>https://communities.sas.com/t5/SAS-Programming/web-crawler-for-finding-multiple-instances-of-the-same-keyword/m-p/169830#M32596</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;If I understand your problem, which I'm not sure I do, you can't simply change a single parameter in the code you have to get extra lines.&lt;/P&gt;&lt;P&gt;SAS processes data line by line, so it's more complex than that.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I don't usually say this, but I question whether SAS is the best tool for this type of work. It's not that it can't be done; it's more a question of whether it should be.&lt;/P&gt;&lt;P&gt;The Kimono interface is fairly good:&lt;/P&gt;&lt;P&gt;&lt;A href="http://www.kimonolabs.com/welcome.html" style="font-size: 10pt; line-height: 1.5em;" title="http://www.kimonolabs.com/welcome.html"&gt;the kimono blog&lt;/A&gt;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Sat, 05 Apr 2014 03:48:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/web-crawler-for-finding-multiple-instances-of-the-same-keyword/m-p/169830#M32596</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2014-04-05T03:48:22Z</dc:date>
    </item>
    <item>
      <title>Re: web crawler for finding multiple instances of the same keyword</title>
      <link>https://communities.sas.com/t5/SAS-Programming/web-crawler-for-finding-multiple-instances-of-the-same-keyword/m-p/169831#M32597</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;@sonikm24&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Some time ago it was important to highlight issues relevant to Y2K compliance. To show the context of each issue, my code buffered program lines in blocks whose size was controlled by a macro variable (I started with 3, but the client needed 5). The code used ARRAYs to buffer the lines of code. You might have a similar concern: there are multiple strings to target, and these must be allowed to overlap.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The code was not concise.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Best of luck with your challenge.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;peterC&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 07 Apr 2014 14:37:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/web-crawler-for-finding-multiple-instances-of-the-same-keyword/m-p/169831#M32597</guid>
      <dc:creator>Peter_C</dc:creator>
      <dc:date>2014-04-07T14:37:05Z</dc:date>
    </item>
    <item>
      <title>Re: web crawler for finding multiple instances of the same keyword</title>
      <link>https://communities.sas.com/t5/SAS-Programming/web-crawler-for-finding-multiple-instances-of-the-same-keyword/m-p/169832#M32598</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;It looks like you already have a SAS data set by the time you search for NOTIONAL.&amp;nbsp; In that case, finding 5 lines doesn't have to be terribly difficult.&amp;nbsp; You might have decisions to make if you find NOTIONAL on the first line (for example). This solution takes a maximum of 5 lines:&amp;nbsp; the line itself, plus 2 before and 2 after (assuming that those lines actually exist).&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;data SiteVisitnew;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; set SiteVisitnew nobs=_total_obs_;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; /* PRXPARSE returns a pattern ID, which is always nonzero for a valid pattern */&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; patternID = prxparse('/NOTIONAL/');&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; /* text_line is a placeholder name: substitute the variable that holds each line of text */&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; if prxmatch(patternID, text_line) then do j=max(1, _n_-2) to min(_total_obs_, _n_+2);&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; set SiteVisitnew point=j;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; output;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; end;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; drop j;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The selection has to test PRXMATCH against the text variable, because patternID on its own is always nonzero and would select every line. I don't know what your text variable is called, so text_line is only a placeholder; that should be easy to fix.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Note that the same line might be selected twice if NOTIONAL appears twice in close proximity.&amp;nbsp; There are ways to handle that, but you would have to define first what "handling that" actually means.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 07 Apr 2014 14:49:36 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/web-crawler-for-finding-multiple-instances-of-the-same-keyword/m-p/169832#M32598</guid>
      <dc:creator>Astounding</dc:creator>
      <dc:date>2014-04-07T14:49:36Z</dc:date>
    </item>
    <item>
      <title>Re: web crawler for finding multiple instances of the same keyword</title>
      <link>https://communities.sas.com/t5/SAS-Programming/web-crawler-for-finding-multiple-instances-of-the-same-keyword/m-p/169833#M32599</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hello @Astounding,&lt;/P&gt;&lt;P&gt;I inserted the snippet of code from your message above into my SAS code. I have attached the integrated code to this message for your reference (and also an Excel file to test with). But when I run the code, the full SEC file is returned rather than only the required lines.&lt;/P&gt;&lt;P&gt;Please let me know if I made a mistake when adding your code to mine (I just commented out the data SiteVisitnew part of my code and added your code).&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks.&lt;/P&gt;&lt;P&gt;Sonik Mandal&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 10 Apr 2014 21:16:52 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/web-crawler-for-finding-multiple-instances-of-the-same-keyword/m-p/169833#M32599</guid>
      <dc:creator>sonikm24</dc:creator>
      <dc:date>2014-04-10T21:16:52Z</dc:date>
    </item>
  </channel>
</rss>

