RyanJB
Obsidian | Level 7

Hi, I'm trying to scrape some data via PROC HTTP from a page that requires authentication, but am stuck at the login phase, where I get a 403 response for a CSRF error:

CSRF verification failed. Request aborted. You are seeing this message because this HTTPS site requires a Referer header to be sent by your web browser but none was sent ... ...

I'm very much a novice to this, so hopefully it is an easy fix. Below is my code plus the output from PROC HTTP debug:

filename src temp;
proc http
	method = "POST" 
	url = "https://www.awebsite.com/accounts/login/?next=/accounts/login/"
	out = src
	webusername = "user"
	webpassword = "pass";
	headers "Connection"="keep-alive";
	debug level=2;
run;

data test; *from a SAS blog post;
	infile src length = len lrecl = 32767;
	input line $varying32767. len;
		line = strip(line);
	if len > 0;
run;

And the debug output from my log:

> POST /accounts/login/?next=/accounts/login/ HTTP/1.1
> User-Agent: SAS/9
> Host: www.awebsite.com
> Accept: */*
> Content-Length: 0
> Cookie: csrftoken=oMSSRKxD3Z5e58DWDhd8cMhPkZ8IHqBFePBaZycwyAzxD3jFVKG2ZG8LFnUMRQyh
> Connection: keep-alive
> Content-Type: application/x-www-form-urlencoded
> 
< HTTP/1.1 403 Forbidden
< Date: Fri, 05 Nov 2021 15:11:34 GMT
< Server: Apache/2.4.34 (Red Hat) OpenSSL/1.0.2k-fips mod_wsgi/4.6.8 Python/3.8
< X-Frame-Options: SAMEORIGIN
< Content-Length: 1889
< Keep-Alive: timeout=5, max=100
< Connection: Keep-Alive
< Content-Type: text/html; charset=UTF-8

I'm not finding much on the CSRF error for SAS, mainly discussions on other platforms. 

 

I assume with the keep-alive connection, once I'm able to successfully authenticate then I'll be able to pull the data from other pages, so this appears to be my main roadblock.

 

Thanks for any help!

3 REPLIES
RyanJB
Obsidian | Level 7

I assume it has to do with the HTTPS call, and https://support.sas.com/documentation/cdl/en/proc/61895/HTML/default/viewer.htm#a003286920.htm has some instructions on how to deal with that, but I just don't really understand how to implement them.

 

Edit to add: This site (https://medium.com/@codebyamir/the-java-developers-guide-to-ssl-certificates-b78142b3a0fc) also seems to offer some hints on where the JRE certificates may be stored, as well as the default password, but pasting that path and password into the code from the SAS support doc above and running it from the Windows command line does not change the site's response.

 

I also tried manually editing SASV9.cfg to add the certificate path and password, but the page's response still does not change.

ChrisHemedinger
Community Manager

If this site requires authentication then your approach might depend on a couple of things.

 

First, it might be simple and require a cookie-based session like this example.

 

But it could be more complex. If by logging into the site you receive a token, you may need to pass that token in the URL or headers for any subsequent calls.

 

A quick search on the URL pattern in your example suggests the site may be built with Django, a Python web framework. If the site offers an API, or serves up data that you can reach another way (a database or another API closer to the source), then that might be a more fruitful approach.

 

Otherwise you're left using PROC HTTP to mimic the interactions of a browser, logging in and capturing session data that can be passed into subsequent calls. Sometimes you can learn more by opening your browser developer console and observing the Network tab, so you can see where browser-to-site network calls go as the site serves up data.
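
To make the "mimic a browser" idea concrete, here is a rough sketch of a form login against a Django-style site. It is untested against your site and rests on several assumptions: that the site uses Django's default CSRF settings (a `csrftoken` cookie that must be echoed back in a `csrfmiddlewaretoken` form field), that the login form fields are named `username` and `password`, and that your SAS release supports the DEBUG statement and the SYS_PROCHTTP_STATUS_CODE macro variable. The filerefs and the `login_body` macro variable are made-up names. Note that WEBUSERNAME= and WEBPASSWORD= are for HTTP-level authentication such as Basic auth; an HTML login form usually expects the credentials in the POST body, which is empty in your log (Content-Length: 0).

```sas
/* Step 1: GET the login page so the server issues a fresh csrftoken cookie. */
filename page temp;
filename hdrs temp;

proc http
	method = "GET"
	url = "https://www.awebsite.com/accounts/login/"
	out = page
	headerout = hdrs
	clear_cookies;   /* drop cookies cached by earlier attempts */
run;

/* Step 2: read the token from the Set-Cookie response header and build   */
/* the form body. Field names are assumptions (Django's defaults).        */
data _null_;
	infile hdrs length = len lrecl = 32767;
	input line $varying32767. len;
	length token $ 200;
	pos = find(line, "csrftoken=");
	if pos > 0 then do;
		/* "csrftoken=" is 10 characters; the value ends at the next ";" */
		token = scan(substr(line, pos + 10), 1, ";");
		/* single quotes keep the macro processor away from the & separators */
		call symputx("login_body",
			cats("csrfmiddlewaretoken=", token, '&username=user&password=pass'));
	end;
run;

/* Step 3: POST the form. The cached cookie is sent automatically; the    */
/* Referer header is what the 403 message said was missing.               */
filename resp temp;

proc http
	method = "POST"
	url = "https://www.awebsite.com/accounts/login/?next=/accounts/login/"
	in = "%superq(login_body)"
	ct = "application/x-www-form-urlencoded"
	out = resp;
	headers "Referer" = "https://www.awebsite.com/accounts/login/";
	debug level = 2;
run;

%put NOTE: login returned HTTP &SYS_PROCHTTP_STATUS_CODE;
```

If the login works, the server would normally set a session cookie, and later PROC HTTP calls in the same SAS session should send it along from the cookie cache. It is the cookie, rather than the keep-alive connection, that carries the logged-in state. If the real form uses different field names or extra hidden fields, the browser's Network tab will show what the body needs to contain.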

RyanJB
Obsidian | Level 7
Thanks, that link did send me in a somewhat helpful direction. It at least got me to new errors on the page: it had been saying there was an error in the header, but adding more of the header information that a browser sends changed the error to a bad CSRF cookie.

I very much wish they had a standard API or data download option, but they do not.

I may keep poking at it, but it's not looking great.
