<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic How to scrape websites that require login credentials using proc http? in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/How-to-scrape-websites-that-require-login-credentials-using-proc/m-p/763784#M241889</link>
    <description>&lt;P&gt;Hi everyone, I am trying to scrape a &lt;A href="https://www.pressreader.com/catalog" target="_blank" rel="noopener"&gt;website&lt;/A&gt; that requires a subscription. I am using PROC HTTP to retrieve the page's HTML source and then SAS character functions to extract the information I need. However, for this particular website the returned HTML contains an "email not verified" message. The PROC HTTP code I am using is below.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;filename dest "location";
proc http
	url = "https://www.pressreader.com/catalog"
	out = dest
	method = "GET" 
	webusername="XXXX"
	webpassword="XXXX"
	auth_basic;
run;&lt;/CODE&gt;&lt;/PRE&gt;
</description>
    <pubDate>Wed, 25 Aug 2021 05:19:45 GMT</pubDate>
    <dc:creator>kaziumair</dc:creator>
    <dc:date>2021-08-25T05:19:45Z</dc:date>
    <item>
      <title>How to scrape websites that require login credentials using proc http?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-scrape-websites-that-require-login-credentials-using-proc/m-p/763784#M241889</link>
      <description>&lt;P&gt;Hi everyone, I am trying to scrape a &lt;A href="https://www.pressreader.com/catalog" target="_blank" rel="noopener"&gt;website&lt;/A&gt; that requires a subscription. I am using PROC HTTP to retrieve the page's HTML source and then SAS character functions to extract the information I need. However, for this particular website the returned HTML contains an "email not verified" message. The PROC HTTP code I am using is below.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;filename dest "location";
proc http
	url = "https://www.pressreader.com/catalog"
	out = dest
	method = "GET" 
	webusername="XXXX"
	webpassword="XXXX"
	auth_basic;
run;&lt;/CODE&gt;&lt;/PRE&gt;
</description>
      <pubDate>Wed, 25 Aug 2021 05:19:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-scrape-websites-that-require-login-credentials-using-proc/m-p/763784#M241889</guid>
      <dc:creator>kaziumair</dc:creator>
      <dc:date>2021-08-25T05:19:45Z</dc:date>
    </item>
    <item>
      <title>Re: How to scrape websites that require login credentials using proc http?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-scrape-websites-that-require-login-credentials-using-proc/m-p/763824#M241911</link>
      <description>&lt;P&gt;Some sites are designed to be interactive and serve their content only to a browser that runs JavaScript, loading it as the user browses the page.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Also, this site supports "native" accounts as well as social sign-in. If you used a social account like Facebook or Google, then those credentials would likely not work from a script like PROC HTTP or cURL.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;It appears that &lt;A href="https://developers.pressreader.com/" target="_self"&gt;PressReader.com offers an API&lt;/A&gt;. This would be a much more reliable method for pulling data from the site. It requires an API account to get a token, though I'm not sure whether there is a cost. For information about using PROC HTTP with APIs like this, &lt;A href="https://communities.sas.com/t5/Ask-the-Expert/How-Do-You-Use-SAS-to-Access-Data-and-APIs-From-the-Web-Q-amp-A/ta-p/699613" target="_self"&gt;see this Ask the Expert session&lt;/A&gt;.&lt;/P&gt;</description>
      <pubDate>Wed, 25 Aug 2021 12:06:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-scrape-websites-that-require-login-credentials-using-proc/m-p/763824#M241911</guid>
      <dc:creator>ChrisHemedinger</dc:creator>
      <dc:date>2021-08-25T12:06:12Z</dc:date>
    </item>
    <item>
      <title>Re: How to scrape websites that require login credentials using proc http?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-scrape-websites-that-require-login-credentials-using-proc/m-p/763834#M241916</link>
      <description>&lt;P&gt;Hi Chris,&lt;/P&gt;
&lt;P&gt;Thanks for your help. I will definitely check the API. In the code I used native account credentials, so maybe the website is interactive and requires a browser session, as you suggested.&lt;/P&gt;
&lt;P&gt;Could you please confirm whether the code I used is the correct way to use PROC HTTP for websites that require login credentials?&lt;/P&gt;</description>
      <pubDate>Wed, 25 Aug 2021 13:33:30 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-scrape-websites-that-require-login-credentials-using-proc/m-p/763834#M241916</guid>
      <dc:creator>kaziumair</dc:creator>
      <dc:date>2021-08-25T13:33:30Z</dc:date>
    </item>
    <item>
      <title>Re: How to scrape websites that require login credentials using proc http?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-scrape-websites-that-require-login-credentials-using-proc/m-p/763842#M241918</link>
      <description>&lt;P&gt;Yes, that's the correct method for basic authentication (username and password). However, many websites use other types of authentication, such as OAuth or another token scheme. There, providing your username and password is just a way to obtain that token, which the website then manages automatically for further browsing and requests.&lt;/P&gt;</description>
      <pubDate>Wed, 25 Aug 2021 14:09:19 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-scrape-websites-that-require-login-credentials-using-proc/m-p/763842#M241918</guid>
      <dc:creator>ChrisHemedinger</dc:creator>
      <dc:date>2021-08-25T14:09:19Z</dc:date>
    </item>
    <item>
      <title>Re: How to scrape websites that require login credentials using proc http?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-scrape-websites-that-require-login-credentials-using-proc/m-p/763863#M241921</link>
      <description>Thanks a lot for your help.</description>
      <pubDate>Wed, 25 Aug 2021 14:26:27 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-scrape-websites-that-require-login-credentials-using-proc/m-p/763863#M241921</guid>
      <dc:creator>kaziumair</dc:creator>
      <dc:date>2021-08-25T14:26:27Z</dc:date>
    </item>
  </channel>
</rss>

