DATA Step, Macro, Functions and more

Accessing a URL with the ZIP engine

Accepted Solution Solved
Reply
Respected Advisor
Posts: 4,934
Accepted Solution

Accessing a URL with the ZIP engine

Hi All,

Can I access the contents of this file (a .csv) :

http://donnees.ville.montreal.qc.ca/storage/f/2013-10-05T17%3A49%3A08.194Z/l2p-801-15052013-leplatea...

for example, with the ZIP engine directly, without downloading it first?

I'm using SAS 9.4 on Windows 7.

PG

PG

Accepted Solutions
Solution
‎02-04-2015 02:52 PM
Trusted Advisor
Posts: 1,301

Re: Accessing a URL with the ZIP engine

It would depend on the specific situation, but 's recommendation would be on the list:

Use PROC HTTP to download the file, probably to a TEMP file if I didn't care to keep it.

Use the FILENAME statement with the ZIP access method to gain access to the internal file

View solution in original post


All Replies
Super Contributor
Super Contributor
Posts: 3,174

Re: Accessing a URL with the ZIP engine

The contents of the file (referenced by the URL) is a zip-archive container, with a "CSV" file inside, which must be processed after retrieving from the foreign/remote site.  Explore SAS 9.4 with the FILENAME ZIP engine, possibly, after retrieving the container file, using SAS FILENAME, PROC DOWNLOAD, possibly.

Scott Barry

SBBWorks, Inc.

Super Contributor
Super Contributor
Posts: 3,174

Re: Accessing a URL with the ZIP engine

Recommend native SAS function/facility here with FILENAME / FTP engine to retrieve the file, followed by SAS ODS PACKAGE to extract from the zip-archive.

Trusted Advisor
Posts: 1,301

Re: Accessing a URL with the ZIP engine

You can do it with a pipe (which will be system dependent).  The following would be an example for linux.

filename x pipe 'wget -qO- http://donnees.ville.montreal.qc.ca/storage/f/2013-10-05T17%3A49%3A08.194Z/l2p-801-15052013-leplatea... | funzip';

Respected Advisor
Posts: 4,934

Re: Accessing a URL with the ZIP engine

Thank you both.

I was naively hoping to wrap the URL into a Windows UNC, somehow, and use that as the 'external-file' physical name in the FILENAME statement. That way I could access the contents of the zip archive directly. Just a dream...

PG

PG
Trusted Advisor
Posts: 1,301

Re: Accessing a URL with the ZIP engine

In windows you could try something like the following:

filename x pipe 'cmd /C "pushd \\myshare\folder\unc && C:\PROGRA~1\7-Zip\7z.exe x -so foo.zip 2> _garbage.txt && del _garbage.txt && popd"';

Respected Advisor
Posts: 4,934

Re: Accessing a URL with the ZIP engine

Thank you Matt,

I cannot use your suggestion unless I understand what it does. Would you please explain the OS commands feeding that pipe? Can foo.zip be a URL?

PG

PG
Trusted Advisor
Posts: 1,301

Re: Accessing a URL with the ZIP engine

You said you are wanting to use UNC path, not URL, very different.

cmd /c

- starts a new instance of the command interpreter and carries out the following command string and stops

http://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/cmd.mspx?mfr=true

pushd \\myshare\folder\unc

- assigns a UNC path a temporary drive letter and changes directory to this location

https://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/pushd.mspx?mfr=true

C:\PROGRA~1\7-Zip\7z.exe x -so foo.zip 2> _garbage.txt

- 7-zip is a open source file archiver that has the ability to output the contents of a ZIP archive to STDOUT

7-Zip

Where:

     x - denotes to eXtract from the archive

     -so - write contents of archive to STDOUT

     foo.zip - is the name of the file contained in the UNC path assigned with PUSHD

     2> _garbage.txt - this redirects the other messages produced by 7z.exe to a junk file, redirecting to NUL doesn't work, for some reason

del _garbage.txt

- Delete the file we had to create with 7z.exe to get the STDOUT write to work

https://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/del.mspx?mfr=true

popd

- Removes the temporary drive-letter assignment created with pushd and returns to the original working directory

https://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/popd.mspx?mfr=true

&&

The double ampersand (&&) command separator strings all these commands together, with error checking (only running one after the previous runs without error

http://support.microsoft.com/kb/279253

If you are wanting to use a proper URL in Windows it would be a different set of commands and different tools.

Trusted Advisor
Posts: 1,301

Re: Accessing a URL with the ZIP engine

If you want to use URL then:

Get Wget for Windows

Get bsdtar for Windows with LibArchive for Windows

filename x pipe 'wget -qO- http://donnees.ville.montreal.qc.ca/storage/f/2013-10-05T17%3A49%3A08.194Z/l2p-801-15052013-leplatea... | bsdtar -xvf-';


Where:

wget downloads the zipfile

-q sets the wget command to be quiet

-O- sets the output to be directed to STDOUT

bsdtar reads the zip file from the pipe and extracts the contents, again to STDOUT

-x extract

-v tells bsdtar to operate verbosely

-f- indicates to operate on the file coming from STDIN

Respected Advisor
Posts: 4,934

Re: Accessing a URL with the ZIP engine

Thank you Matt, this should be very helpful to me and hopefully to others too!

PG

PG
Trusted Advisor
Posts: 1,301

Re: Accessing a URL with the ZIP engine

It's not something I'd really recommend, but's it certainly possible to do.

Respected Advisor
Posts: 4,934

Re: Accessing a URL with the ZIP engine

I have to ask! What would you really recommend? - PG

PG
Solution
‎02-04-2015 02:52 PM
Trusted Advisor
Posts: 1,301

Re: Accessing a URL with the ZIP engine

It would depend on the specific situation, but 's recommendation would be on the list:

Use PROC HTTP to download the file, probably to a TEMP file if I didn't care to keep it.

Use the FILENAME statement with the ZIP access method to gain access to the internal file

Respected Advisor
Posts: 4,934

Re: Accessing a URL with the ZIP engine

I'll try that first then. Thanks again Matt. - PG

PG
Super User
Posts: 10,044

Re: Accessing a URL with the ZIP engine

PG,

This could not possibly be done in one step .

firstly proc http + TEMP engine download this ZIP file ,then use filename's ZIP engine.

Xia Keshan

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 15 replies
  • 488 views
  • 6 likes
  • 4 in conversation