BookmarkSubscribeRSS Feed

Using FILENAME URL to Access Internet Information

Started 3 weeks ago by
Modified 3 weeks ago by
Views 1,513

MarkJordan.jpgIntroduction

 

If SAS is the ultimate multitool for working with data, then the FILENAME statement is one of my favorite “blades”. You can use a FILENAME statement to read from and write to files and directories in the local file system, items in a SAS catalog, members of a ZIP archive, files on an FTP server, file service items in SAS Viya, and even the system clipboard. You can access HDFS files in Hadoop, Azure files including Blob Storage, and Google Cloud Storage. You can even have your SAS program email results to you or others using FILENAME. But today, I want to discuss one of my favorite uses: using the FILENAME statement to access data from URLs on the internet.

 

With FILENAME URL, you can directly read from static web pages, a technique I’ve often used for web scraping in SAS. But the real power lies in executing API GET requests and processing the responses. FILENAME URL uses the HTTP protocol to access resources on the internet, just like PROC HTTP. While PROC HTTP provides significantly more flexibility than FILENAME URL, the code required is also more complex and requires you to create a local copy of the requested resource. With FILENAME URL, you directly read the internet resource in your SAS code without the need for a local copy. In the following examples, we’ll use URLs that access httpbin.org, a free and simple service that allows you to test HTTP requests and responses to demonstrate various options.

 

The Basics

 

First, let’s get a response to a simple GET request, with no authentication required.

 

/* Basic GET from a website that requires no authentication */
filename in url "https://httpbin.org/get"; /* URL to access */
data _null_;
	infile in;
	input;
 	put _infile_;
run;
filename in clear;

 

In the log, we can see some very useful information. The note after the data step contains information about my local machine and the httpbin.org server:

 

01_output_MJ_Screenshot 2025-02-18 115646.png

Select any image to see a larger version.
Mobile users: To view the images, select the "Full" version at the bottom of the page.

 

The DATA _NULL_ step also printed the body of the response from the server in the log. We cans see here that information the server provides a JSON response by default:

 

02_output_MJ_Screenshot 2025-02-18 115930.png

 

We might want to specify that we only want JSON as the response format. We can do that by using the ACCEPT= option to modify the Accept value of the request header:

 

/* Modify the ACCEPT header to specify request a JSON */
filename in url "https://httpbin.org/get" accept="application/json"; /* URL to access */
data _null_;
	infile in;
	input;
	put _infile_;
run;

 

The log shows pretty much the same JSON as was returned by the previous request. Now that we are sure the result will be JSON, we can use the JSON LIBNAME engine to read the response instead of the DATA _NULL_ step, allowing us to access the response as if it was a SAS library.

 

/* A LIBNAME statement with the JSON fileref and JSON engine */
libname in json;
title "Alldata";
proc print data=in.alldata;run;title;
libname in clear;
filename in clear;

 

 01_MJ_Jedi_2025_FILENAME_URL_01.png 

 

Capturing Response Headers

 

If you wish to capture the response headers, you can create a filref to a text file and specify that fileref in the HEADERS= option like this:

 

/* Capturing the response headers */
/* Create the header text file */
filename head temp; 
/* Use the text file fileref in headers= */
filename in url "https://httpbin.org/xml" headers=head; 

data _null_;
	infile in;
	input;
	put _infile_;
run;
/* After read from the URL, the headers have been captured */
data _null_;
	infile head;
	input;
	put _infile_;
run;
filename head clear;
filename in clear;

 

The second DATA _NULL_ step writes the header information to the log:

 

03_output_MJ_Screenshot 2025-02-18 120044.png

 

Authentication

 

Basic UserID and Password

 

This is the simplest authentication method to implement and use, but it’s quite insecure. The UserID and password are included in the SAS program in clear text, and will be sent over the network as clear text, too (unless the URL begins with https). You should never use this method in production code.

 

/* UserID and Password - plain text */
filename in url "https://httpbin.org/basic-auth/myID/1TrickPony" 
   user="myID"        /* username */
   pass="1TrickPony"  /* password */
;

 

The log reveals our authentication was successful:

 

04_output_MJ_Screenshot 2025-02-18 120143.png

 

UserID and SAS-Encoded Password

 

A slightly more secure method is to use PROC PWENCODE to encode the password. This way the password does not appear as clear text in the code. You do this before using FILENAME URL to access the internet resource. In this program, I encode my password and place the encoded value in a macro variable for later use:

 

/* Encode the password to a temp file */
filename p temp;
proc pwencode in="1TrickPony"
  method=sas002 out=p;
run;

/* Put the encoded password in a macro variable */
data _null_;
	infile p;
	input;
	call symputx('EncodedPW',_infile_);
run;
filename p clear;

 

Now, I can resolve the macro variable in my code to provide the encoded password:

 

filename in url "https://httpbin.org/basic-auth/myID/1TrickPony" 
   user="myID"        /* username */
   pass="&EncodedPW"  /* encoded password */
;

data _null_;
	infile in;
	input;
	put _infile_;
run;

filename in clear;

 

The log reveals that, once again, our authentication was successful:

 

04_output_MJ_Screenshot 2025-02-18 120143.png

 

Prompt for UserID/Password (SAS Windowing Environment only)

 

If you are using the SAS Windowing Environment, you can avoid hard-coding your userID and password by using the PROMPT option. As you try to access the resource, the option opens a dialog window prompting you to enter your userID and password:

 

/* Prompt for UserID and Password - only works in SAS Windowing environment */
filename in url "https://httpbin.org/basic-auth/myID/1TrickPony" 
   prompt
;

/* When prompted, userID is myID and pw is 1TrickPony. 
   Leave Proxy ID and Proxy PW blank                   */
data _null_;
	infile in;
	input;
	put _infile_;
run;
filename in clear;

 

When the data step runs, you are prompted for your UserID:

 

02_MJ_Jedi_2025_FILENAME_URL_02a_Userid.png

 

And then for your password:

 

03_MJ_Jedi_2025_FILENAME_URL_02b_Password.png

 

Next you are prompted for a Proxy UserID and password – we’ll leave both of those blank:

 

04_MJ_Jedi_2025_FILENAME_URL_02c_Proxy.png 

 

The log once again shows successful authentication:

 

05_output_MJ_Screenshot 2025-02-18 120230.png

 

Proxy Server Authentication

 

If you must separately authenticate to a proxy server, you can use the PROXY= option to specify the server, and the PUser= and PPass= options to specify the proxy server userID and password, respectively. Don’t forget the CONNECT option to make the proxy connection. I don’t have a free proxy server that requires authentication for you to use, so this code won’t actually run, but it’s a good template for future use:

 

filename in url "https://httpbin.org/get"
   PROXY="my.proxyserver.com"
   PUSER="myProxyID"
   PPass="myProxyPassword"  
   CONNECT;
run;

 

Debugging

 

Adding the DEBUG option will cause lots of informational notes to be written to the SAS log. In this example, our request fails, but from the log, it’s not so obvious why:

 

filename in url "https://httpbin.org/get/badfile.html";  
data _null_;
	infile in;
	input;
 	put _infile_;
run;
filename in clear;

 

Log:

 

07_output_MJ_Screenshot 2025-02-18 120419.png

 

Let’s try that again using the DEBUG option:

 

filename in url "https://httpbin.org/get/badfile.html" debug; 
data _null_;
	infile in;
	input;
 	put _infile_;
run;
filename in clear;

 

Log:

 

08_output_MJ_Screenshot 2025-02-18 120511.png

 

Notice the 8th NOTE: from the top states “404 NOT FOUND” – this indicates we’ve asked for a resource from the server that does not exist.

 

Here’s a different dilemma:

 

filename in url "https://httpbin.org/basic-auth/myID/1TrickPony" 
   user="myID"        /* username */
   pass="CircusPony"  /* bad password */;

data _null_;
	infile in;
	input;
	put _infile_;
run;
filename in clear;

 

Log:

 

09_output_MJ_Screenshot 2025-02-18 120557.png

 

This looks like a pretty straightforward authentication issue. If we add the DEBUG option to this code, the additional log content merely reinforces that this is an authentication issue:

 

10_output_MJ_Screenshot 2025-02-18 120641.png

 

Real-World Application: World Bank API

 

For a more practical example, I’ll use FILENAME URL to execute a GET request to the World Bank API, then use the JSON LIBNAME engine to process the response in SAS. The query in the code will return a JSON response which includes the 3 letter ISO country code for each country, and the country’s population as of January 2020. I’ll use the JSON libname engine to directly read the JSON response, and use PROC SQL to join the API response ROOT table to a SAS data set containing the 3 letter ISO country code and country names to create a SAS data set containing the country name and population for all countries with populations greater than one hundred million. I’ll finish up by using PROC SQL and PROC SGPLOT to make a nicely laid out report from the final table. Here's the code:

 

/* Set up the FILENAMES used to retrieve the World Bank data */
filename in url  'https://api.worldbank.org/v2/country/all/indicator/SP.POP.TOTL?date=2020&format=json&per_page=5000';
filename jmap temp;
libname in json map=jmap automap=replace ;

/* Join the the World Bank data to the SAS table to create the Country_pop_2020 data set */
proc sql;
create table Country_pop_2020 as
select Country
      ,code
      ,value format=comma15. as Population
   from in.root as r
		inner join
		  countries as c
	on r.countryiso3code=c.Code
	order by value desc
;
quit;

/* Don't need the World Bank connection anymore. Clear librefs and filerefs */
libname in clear;
filename in clear;
filename jmap clear;

/* Create the report based on our new table */
title "Countries with Populations >100,000,000";
ods layout gridded width=6.5in columns=2;
ods region;
proc sql;
select Country, Code, Population
   from Country_pop_2020
	where Population gt 100000000
   order by Population desc
;
quit;
ods region;
proc sgplot data=country_pop_2020;
   vbar Country / response=Population categoryorder = respdesc;
	where Population gt 100000000;
run;
ods layout end;
title;

 

And here’s the report it generates:

 

05_Jedi_2025_FILENAME_URL_03_WorldBankReport.png

 

I hope you found this article interesting and useful. I’m a bit curious, though. If you read this to the end, something must have piqued your interest in FILENAME URL. What brought you here - was it a problem you were trying to solve? If so, did you find enough information here to get you started?

MarkJordan.jpg


Until next time, may the SAS be with you!

Mark

 

PS: You can download a ZIP file containing a PDF of this post and the SAS code used to create it, including code to create the Countries SAS data set required for the join. GEt it at https://bit.ly/SASJediFilenameURL

Comments

SAS PAckages Framework's packages installation process (those from SASPAC archive and those from other locations) is based on the URL file access!

 

@yabwon I was not aware - but that's pretty cool! I've used it a lot to pull in macros from my Github stash when working on an unfamiliar system - makes for a quick & easy %INCLUDE. 

@SASJedi We can chat about this during our "both next conference". And about wrapping your macros library into a package too! 🙂

I like coding LIBNAMEs and FILENAMEs often. This BLOG note will make me try FILENAME URLs.  Thanks very much. I can try it out on my websites.

@jbilenas - so glad you found the article interesting! Let me know how it goes as you experiment with FILENAME URL 🙂

Version history
Last update:
3 weeks ago
Updated by:
Contributors

sas-innovate-white.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9.

 

Early bird rate extended! Save $200 when you sign up by March 31.

Register now!

Free course: Data Literacy Essentials

Data Literacy is for all, even absolute beginners. Jump on board with this free e-learning  and boost your career prospects.

Get Started

Article Tags