<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic PROC HTTP for multiple pages in SAS Procedures</title>
    <link>https://communities.sas.com/t5/SAS-Procedures/PROC-HTTP-for-multiple-pages/m-p/17483#M3320</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I'm not familiar with the site, and friedegg is definitely better at this than I am, but I presume you are submitting something like: __doPostBack('datagrid_results$_ctl44$_ctl0','')&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The last part indicates which of the (apparently standard) 40 pages you want; they appear to be numbered from ctl0 to ctl39.&amp;nbsp; You could just submit 40 of those statements, numbered from ctl0 to ctl39.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Mon, 19 Dec 2011 23:04:12 GMT</pubDate>
    <dc:creator>art297</dc:creator>
    <dc:date>2011-12-19T23:04:12Z</dc:date>
    <item>
      <title>PROC HTTP for multiple pages</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/PROC-HTTP-for-multiple-pages/m-p/17482#M3319</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I am using PROC HTTP to pull the following site: &lt;A href="http://app.hpla.doh.dc.gov/weblookup/Search.aspx"&gt;http://app.hpla.doh.dc.gov/weblookup/Search.aspx&lt;/A&gt;.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The challenge is that they break up their results on multiple pages.&amp;nbsp; Is there an easy way to tell proc http just to grab it all?&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 19 Dec 2011 18:21:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/PROC-HTTP-for-multiple-pages/m-p/17482#M3319</guid>
      <dc:creator>Jay1</dc:creator>
      <dc:date>2011-12-19T18:21:46Z</dc:date>
    </item>
    <item>
      <title>PROC HTTP for multiple pages</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/PROC-HTTP-for-multiple-pages/m-p/17483#M3320</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I'm not familiar with the site, and friedegg is definitely better at this than I am, but I presume you are submitting something like: __doPostBack('datagrid_results$_ctl44$_ctl0','')&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The last part indicates which of the (apparently standard) 40 pages you want; they appear to be numbered from ctl0 to ctl39.&amp;nbsp; You could just submit 40 of those statements, numbered from ctl0 to ctl39.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 19 Dec 2011 23:04:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/PROC-HTTP-for-multiple-pages/m-p/17483#M3320</guid>
      <dc:creator>art297</dc:creator>
      <dc:date>2011-12-19T23:04:12Z</dc:date>
    </item>
    <item>
      <title>PROC HTTP for multiple pages</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/PROC-HTTP-for-multiple-pages/m-p/17484#M3321</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I want to automate it, though.&amp;nbsp; As they add people, the number of pages will grow.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Also, is it possible for some sites to tell the difference between PROC HTTP and Firefox?&amp;nbsp; The following code seems to get blocked:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;filename in&amp;nbsp; '...';&lt;/P&gt;&lt;P&gt;filename out '...';&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;DATA _NULL_;&lt;/P&gt;&lt;P&gt;y='__VIEWSTATE=%2FwEPDwUKLTMzOTcyOTAwMGRkvgRbuwnv4KbGV9NO1ykbEZBjrSg%3D&amp;amp;__EVENTVALIDATION=%2FwEWCwLCgKrWCgLkw8LZCALkw87ZCALkw8rZCAK31u2PDQLWqM3QDgL%2B%2FupqAqGPgsIEAojE0dMMArfW1a4MAq3qmaYPyupjM75%2FOoF7dcmGrwJNpMcfXWA%3D&amp;amp;ctl00%24PageContent%24SSN1=&amp;amp;ctl00%24PageContent%24SSN2=&amp;amp;ctl00%24PageContent%24SSN3=&amp;amp;ctl00%24PageContent%24fname=&amp;amp;ctl00%24PageContent%24mname=&amp;amp;ctl00%24PageContent%24lname=a&amp;amp;ctl00%24PageContent%24btnSubmit2=Submit';&lt;/P&gt;&lt;P&gt;file in lrecl=475;&lt;/P&gt;&lt;P&gt;put y;&lt;/P&gt;&lt;P&gt;RUN;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;PROC HTTP&lt;/P&gt;&lt;P&gt;in=in&lt;/P&gt;&lt;P&gt;out=out&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;url='&lt;/SPAN&gt;&lt;A class="jive-link-external-small" href="http://health.state.tn.us/AbuseRegistry/default.aspx"&gt;http://health.state.tn.us/AbuseRegistry/default.aspx&lt;/A&gt;&lt;SPAN&gt;'&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;method='POST'&lt;/P&gt;&lt;P&gt;ct='application/x-www-form-urlencoded';&lt;/P&gt;&lt;P&gt;RUN;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 20 Dec 2011 15:51:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/PROC-HTTP-for-multiple-pages/m-p/17484#M3321</guid>
      <dc:creator>Jay1</dc:creator>
      <dc:date>2011-12-20T15:51:05Z</dc:date>
    </item>
    <item>
      <title>PROC HTTP for multiple pages</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/PROC-HTTP-for-multiple-pages/m-p/17485#M3322</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Scraping dynamic forms from ASP.NET applications is difficult, and SAS does not really have a lot of the tools you need.&amp;nbsp; Also, when the sites you are scraping eventually replace their encryption keys, nothing will work anymore.&amp;nbsp; With the large number of scrapes you are trying to accomplish, I would recommend using an outside tool built specifically for what you are trying to do.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;If you did want to do this in SAS, you would need to work out exactly what the JavaScript call __doPostBack is doing.&amp;nbsp; It is probably performing a new POST.&amp;nbsp; Hopefully it still uses the same __VIEWSTATE and __EVENTVALIDATION pieces.&amp;nbsp; The process would be to make a call to the initial search results, gather your data, and check for a JavaScript link to a subsequent page.&amp;nbsp; If one exists, make a new call to the subsequent datagrid location, and so on, in a loop.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Best of luck.&amp;nbsp; If I were you, I would move outside of SAS to perform these heavy scraping tasks (and you should probably confirm their legality first).&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 20 Dec 2011 19:11:07 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/PROC-HTTP-for-multiple-pages/m-p/17485#M3322</guid>
      <dc:creator>FriedEgg</dc:creator>
      <dc:date>2011-12-20T19:11:07Z</dc:date>
    </item>
  </channel>
</rss>

