
How do I read and write ZIP files in SAS? Q&A, Slides, and On-Demand Recording

Started ‎04-16-2025 by
Modified ‎04-16-2025 by

“Zip, zip, hooray!” is what you’ll be saying when you learn how to read and create ZIP files – just like that.  Did you know that SAS has built-in capabilities that allow you to work with ZIP files in your SAS programs?  We’ll share the methods that SAS offers when working with these file types.

 

The webinar covers how to:

  • Use the FILENAME ZIP method to read and write content in ZIP files.
  • Discover what files and folders exist within a ZIP archive and read data files directly within the DATA step.
  • Read and create gzip (GZ) files, a special form of compressed file typically used on Unix file systems.
  • Consider limitations and cautions when using ZIP files in SAS.
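
As a minimal sketch of the first two points, the following writes a small text member into a ZIP file with a DATA step and then reads it back directly. The ZIP path, the member name, and the `outzip` fileref are placeholders chosen for illustration, not names from the webinar.

```sas
/* Hypothetical path and member name; replace with your own */
filename outzip ZIP "path-to-zip-file.zip" member="data/myfile.txt";

/* Write two lines of text into the ZIP member */
data _null_;
  file outzip;
  put "first line";
  put "second line";
run;

/* Read the same member back, one observation per line */
data work.lines;
  infile outzip truncover;
  input line $char200.;
run;

filename outzip clear;
```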

Watch the on-demand webinar

Q & A

What is CRC-32?

CRC-32 is a checksum method for making sure that what was compressed and added to the archive is exactly the same as what gets uncompressed. It is a small mathematical algorithm that lets you verify that integrity is maintained when content is copied from one place or format to another.
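
If you want to see the checksum that is stored for a member, one possible approach is to loop over the information items that SAS reports for an open ZIP member. This is a sketch under assumptions: the path, member name, and `inzip` fileref are placeholders, and the set of item names returned (which, as far as I know, includes a CRC-32 value on recent SAS 9.4 maintenance releases) can vary by release, so the code lists whatever is available instead of hardcoding a name.

```sas
/* Hypothetical path and member name; replace with your own */
filename inzip ZIP "path-to-zip-file.zip" member="data/myfile.txt";

data member_info;
  length infoname $ 40 infovalue $ 200;
  fid = fopen("inzip", "S");            /* open the member for sequential input */
  if fid > 0 then do;
    do i = 1 to foptnum(fid);           /* number of information items available */
      infoname  = foptname(fid, i);     /* name of the i-th item */
      infovalue = finfo(fid, infoname); /* value of that item */
      output;
    end;
    rc = fclose(fid);
  end;
  keep infoname infovalue;
run;

filename inzip clear;
```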

 

How do you start working with a ZIP file if you don't know its structure?

With some of the methods that I showed, you can use SAS code to list the members of the ZIP file, or to create a data set of them, in the same way that you might discover the contents of a file directory that you haven't seen before.

 

For example, suppose you have a SAS job that needs to discover and process every text file in a folder, and you don't know ahead of time what the text files are. You can use the methods I showed (the DOPEN and DREAD functions) on a ZIP file just as you would on a directory: pull out the member names and the structure, then iterate through each member in subsequent steps to process it.
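
A minimal sketch of that discovery step, assuming a placeholder ZIP path and an illustrative `inzip` fileref, might look like this. It builds a data set with one row per member and flags the entries that are folders.

```sas
/* Hypothetical path; replace with your own ZIP file */
filename inzip ZIP "path-to-zip-file.zip";

data contents(keep=memname isFolder);
  length memname $ 200 isFolder 8;
  fid = dopen("inzip");                 /* open the ZIP like a directory */
  if fid = 0 then stop;                 /* could not open the archive */
  memcount = dnum(fid);                 /* number of members */
  do i = 1 to memcount;
    memname = dread(fid, i);            /* member name, with relative path */
    /* entries whose names end in "/" are folders, not files */
    isFolder = (first(reverse(trim(memname))) = '/');
    output;
  end;
  rc = dclose(fid);
run;

filename inzip clear;
```

The resulting data set can then drive later steps, for example by filtering on member names that end in .txt.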

 

You had an example that added two files to one ZIP in two different data steps. Could they have been added in one step?

Yes, they could have been. You can use the FCOPY function multiple times in a single DATA step, but remember that you need a separate fileref for each source and each destination. In my example, I assigned the filerefs, ran the DATA step that did the copy, cleared and reassigned the filerefs, and then ran a second DATA step that did the second copy. But you could do it all in one DATA step. You can also use DATA step functions to manage filerefs, and that works with ZIP files too. So yes, you can do it all in a single DATA step and get quite complex with it.
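
As a sketch of the single-step version, the following assigns four filerefs up front and calls FCOPY twice in one DATA step. The file paths, member names, and fileref names (`src1`, `dst1`, `src2`, `dst2`) are placeholders for illustration. RECFM=N is used so that the content is copied byte for byte.

```sas
/* Hypothetical source files and ZIP path; replace with your own */
filename src1 "path-to-first-file.txt" recfm=n;
filename dst1 ZIP "path-to-zip-file.zip" member="data/first-file.txt" recfm=n;
filename src2 "path-to-second-file.txt" recfm=n;
filename dst2 ZIP "path-to-zip-file.zip" member="data/second-file.txt" recfm=n;

data _null_;
  length msg $ 300;
  rc1 = fcopy("src1", "dst1");          /* copy the first file into the ZIP */
  if rc1 ne 0 then do;
    msg = sysmsg();
    put rc1= msg=;
  end;
  rc2 = fcopy("src2", "dst2");          /* copy the second file into the same ZIP */
  if rc2 ne 0 then do;
    msg = sysmsg();
    put rc2= msg=;
  end;
run;

filename src1 clear;
filename dst1 clear;
filename src2 clear;
filename dst2 clear;
```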

 

Good morning. Is there a similar solution for WinRAR files, or a way to work with them in SAS? Thank you for the presentation.

I don't think so. The WinRAR format is one of those other compression types that FILENAME ZIP does not support; as far as I know, it handles only ZIP files and GZ files.

 

Is there a file size limitation?

Not one that's imposed by the FILENAME ZIP method. It's possible that your disk or file system limits the size of a file that you can create.

 

I have nested folders of data, with several layers of folders. Can this be zipped using SAS?

Yes. As I showed in some of my examples, if you have a folder structure that you want to maintain when you're creating your zip file, you just include that as part of the MEMBER= option. In my examples, I had a data subfolder and then the file names. For example:

 

/* address a file you want to create in a /data subfolder in the ZIP file */
filename dest ZIP "path-to-zip-file.zip" member="data/myfile.txt";

 

Similarly, when you read a ZIP file, you get a list of all the members, each with its full relative path within the ZIP file. If you want to maintain that path structure when you copy the files out, you have to use other SAS functions to create those folders first, and then use that folder structure as the destination for the files you copy out of the ZIP file.

 

When unzipping a file with subfolders, do they automatically get created if they don't exist? Also, if the member file already exists, will it automatically get overwritten when unzipping?

On the first part of the question: when you're unzipping or copying files out, you have to create the directory structure yourself; SAS won't create it for you. You have to work out which subfolders each file belongs in, create a directory structure that mirrors them, and then explicitly copy the files into those folders. So that's on you. On the second part of the question, would it overwrite? Yes, it will overwrite.
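
One way to handle a single member is sketched below, assuming a hypothetical output root in the `outroot` macro variable, a placeholder ZIP path, and a member stored under a data subfolder. It uses the DCREATE function to create the subfolder if it is missing and then copies the member out with FCOPY.

```sas
/* Hypothetical locations; replace with your own */
%let outroot = /path/to/output;

filename src ZIP "path-to-zip-file.zip" member="data/myfile.txt" recfm=n;
filename dst "&outroot/data/myfile.txt" recfm=n;

data _null_;
  length newdir $ 500 msg $ 300;
  /* create the data subfolder only if it does not exist yet */
  if not fileexist("&outroot/data") then do;
    newdir = dcreate("data", "&outroot");
    if missing(newdir) then do;
      put "ERROR: could not create the data subfolder.";
      stop;
    end;
  end;
  /* copy the member out of the ZIP; an existing file is overwritten */
  rc = fcopy("src", "dst");
  if rc ne 0 then do;
    msg = sysmsg();
    put rc= msg=;
  end;
run;

filename src clear;
filename dst clear;
```

For deeper folder structures, the same DCREATE call would need to be repeated for each level of the path, from the top down.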

 

What resources would you recommend to help with sending a zip file to a secure ftp location using an rsa key?

Secure FTP is a different FILENAME method: there's a FILENAME SFTP. It's a little more work to set up that protocol because it leverages some operating system tools to do the work. So it's a two-step process. Use FILENAME ZIP to create the ZIP file to begin with, and then use a different FILENAME statement with SFTP, and the whole protocol that goes with it, to publish the file.

 

Are there any advantages to using FILENAME ZIP besides the obvious one of using less memory space?

I think the main advantage is that you don't have to call out to external tools. For many years we didn't have FILENAME ZIP inside SAS. It was added in SAS 9.4, so it has been there for something like 8 or 9 years at this point. But there is still a lot of SAS code and many examples out in the world that show calling external tools, which a lot of SAS users aren't able to do because they run in environments where they don't have access to operating system commands in that way. FILENAME ZIP lets you keep it all within SAS code and makes it portable. It doesn't matter what kind of operating system you're running on, and it works the same in SAS 9 and in SAS Viya, whatever tools you're using. You also don't need the elevated privileges for calling external tools that administrators are often reluctant to grant.

 

[Several questions about encryption and password-protected ZIP files] How do you handle reading and writing zip files with passwords? Does ODS Package have a password option?

None of the ZIP methods in SAS (FILENAME ZIP, ODS Package, etc) support encryption methods or password-protected ZIP files. While these techniques are not usually regarded as robust methods to protect data, some organizations still rely on them. For these, you still have to use external tools from SAS in order to make these work.

 

Will .gz files created with gzip on an AIX server be readable using this?

Yeah, I think so. I don't see why not.
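
As a quick way to check a particular file, here is a minimal sketch that reads the first few lines of a gzipped text file. The path and the `ingz` fileref are placeholders, and it assumes a SAS release recent enough to support the GZIP option on FILENAME ZIP (as far as I know, SAS 9.4 Maintenance 5 or later).

```sas
/* Hypothetical path; replace with your own .gz file */
filename ingz ZIP "path-to-file.txt.gz" GZIP;

/* Print the first five lines of the uncompressed content to the log */
data _null_;
  infile ingz truncover obs=5;
  input line $char200.;
  put line;
run;

filename ingz clear;
```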

 

Can Fed employees use FILENAME to access Azure BLOBs to get data in and out of SAS Viya?

(Not really a ZIP question, but we were talking about the FILENAME statement in general...) SAS supports different FILENAME methods, including ones that access cloud storage (example: FILENAME S3). Accessing cloud storage adds a whole level of sophistication, because it requires some kind of authentication or permission to get to that cloud, so there's always going to be some setup. I don't know what special considerations federal employees might have. I know that governments use Azure in a special environment that's different from what the rest of us use. This would require more research.

 

I have to read a gz file containing a sas7bdat table that was built in SAS Enterprise Guide (latin9 encoding) with the X command. When I try to read it in SAS Studio on SAS Viya with latin9 encoding by using FILENAME ZIP with GZIP, I can't. I get a pop-up saying that the SAS table may be corrupted or damaged. To work around that, I gunzip the file on Linux. Can you suggest a solution?

I'm not sure whether that's an expected limitation. As I mentioned, there can sometimes be encoding complications, but in general a ZIP file is binary and encoding is usually not an issue. However, the encoding of a .sas7bdat file is specific to the operating system where you created it. That said, a SAS session running a different encoding should be able to use CEDA to at least read the file. It may be a question you need to work out with Technical Support, to figure out why this happens and whether anything can be done about it.

 

Can SAS unzip .tar files? (Unix)

No, these methods do not support tar files. See this topic: Separating .tar files in SAS 

 

Version history
Last update:
‎04-16-2025 11:32 AM