LzEr23
Obsidian | Level 7

Hello.

I am a newbie to SAS programming and I need help unzipping .gz and .zip files using SAS.

 

I have a .zip file which contains several .csv.gz files within it.

I could find out how to unzip .zip files and how to open .gz files, but it seems quite complicated to do both at once.

What I need is to create new, unzipped .csv files (or just reading them in as datasets would be helpful too).

ACCEPTED SOLUTION
ChrisHemedinger
Community Manager

SAS 9.4 has a native capability to read ZIP files using FILENAME ZIP.  And if you have SAS 9.4 Maintenance 5 (M5), the method also supports GZIP files.

 

Because you have a ZIP of GZ files, you're looking at a two step process.  First, expand the members of the ZIP file with FILENAME ZIP.  I have some examples in this blog post.  Note that the process involves "reading" each ZIP member and writing it as a new file to a temporary space -- effectively copying it out of the ZIP archive.  At the end of this step, you would have a collection of *.csv.gz files in your temp space.
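As a rough sketch of that first step (not the exact code from the blog post), the program below lists the members of the archive and then copies one member out of it byte for byte. The archive path `c:\data\archive.zip`, the member name `file1.csv.gz`, and the fileref and dataset names are all hypothetical placeholders; in practice you would repeat the copy step for each member found in the listing.

```sas
/* Hypothetical archive path; adjust to your environment */
filename inzip zip "c:\data\archive.zip";

/* Step 1: list the members of the ZIP archive */
data work.zip_members (keep=memname);
  length memname $200;
  fid = dopen("inzip");
  if fid = 0 then stop;
  do i = 1 to dnum(fid);
    memname = dread(fid, i);
    /* skip folder entries, which end in a slash */
    if substr(memname, length(memname), 1) ne '/' then output;
  end;
  rc = dclose(fid);
run;

/* Step 2: copy one member (hypothetical name) into the WORK folder */
filename outgz "%sysfunc(getoption(work))/file1.csv.gz";

data _null_;
  /* read fixed 256-byte chunks and write them back out as a byte stream */
  infile inzip(file1.csv.gz) lrecl=256 recfm=F length=length eof=eof unbuf;
  file outgz lrecl=256 recfm=N;
  input;
  put _infile_ $varying256. length;
  return;
eof:
  stop;
run;
```

Copying in binary chunks rather than as text lines matters here, because the .csv.gz members are compressed data and must arrive in the temp space unchanged.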

 

You would then use FILENAME ZIP with the GZIP option to reference each of those csv.gz files in turn.  You don't need to explicitly decompress those -- once you assign that fileref, you should be able to process the file with a DATA step.  See this blog post about reading GZIP files.
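A minimal sketch of that second step, assuming SAS 9.4 M5 or later and a copied-out file named `file1.csv.gz` in the WORK folder. The file name, the dataset name, and the two columns (`id`, `name`) are hypothetical, as is the assumption that the CSV has a header row.

```sas
/* GZIP option requires SAS 9.4 M5 or later */
filename gzin zip "%sysfunc(getoption(work))/file1.csv.gz" gzip;

data work.file1;
  /* dsd handles quoted, comma-delimited values; firstobs=2 skips the assumed header row */
  infile gzin dsd dlm=',' firstobs=2 truncover;
  length id 8 name $50;   /* hypothetical columns */
  input id name;
run;

filename gzin clear;
```

The INFILE and INPUT statements are the same ones you would write for an uncompressed CSV; only the FILENAME statement changes.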


5 REPLIES
RW9
Diamond | Level 26

Why do you have two levels of compression in the first place? It sounds like files gzipped on Linux were moved over to Windows, where someone else then zipped them, which is a bit daft.

Anyway, most zip programs (WinZip, 7-Zip) should be able to deal with both file types.  So you would need to:

1) Get the list of .zip filenames

2) Unzip each one using a command line extract

3) Get the list of extracted .gz files

4) Unzip each one using a command line extract

 

A shell of a program might look like:

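/* List the .zip files; /b returns bare file names without the path */
filename tmp pipe 'dir "c:\data\*.zip" /b';

data _null_;
  infile tmp truncover;
  length nm $200 cmd $500;
  input nm $200.;  /* read the whole line so names with spaces survive */
  /* Point this at your own 7-Zip install; "e" extracts, -o sets the output folder */
  cmd = cats('c:\programfiles\7zip\7z.exe e "c:\data\', nm, '" -oc:\data -y');
  call system(cmd);
run;

/* Second pass: the .gz files extracted by the first pass */
filename tmp pipe 'dir "c:\data\*.gz" /b';

data _null_;
  infile tmp truncover;
  length nm $200 cmd $500;
  input nm $200.;
  cmd = cats('c:\programfiles\7zip\7z.exe e "c:\data\', nm, '" -oc:\data -y');
  call system(cmd);
run;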

The question is, is it worth coding this?  You can select all the files, right click and select Extract to "*\" with 7-Zip, for instance; it's not really a huge effort.

LzEr23
Obsidian | Level 7
Thank you so much for the help.

I can see that I first need to go through the unzipping process with FILENAME ZIP, and once I end up with a bunch of csv.gz temp files, I should move on to dealing with the gzip files.

But unfortunately, the only SAS version available to me is SAS 9.4 Maintenance 3 (M3), and apparently the GZIP option does not work on this version.

This has already been a great help, but can I ask for more advice on handling csv.gz files when I'm not using SAS 9.4 M5?
RW9
Diamond | Level 26

Then your options are to a) use X commands to shell the commands out to the operating system, as I presented above, or b) do it outside of SAS.  There is no reason why you cannot do it via a normal batch file:

https://stackoverflow.com/questions/17077964/windows-batch-script-to-unzip-files-in-a-directory

 

Not all processing needs to be done in SAS.

error_prone
Barite | Level 11
Bad news: if XCMD is disabled in your SAS session, you can't unzip .gz files during SAS execution. With XCMD enabled you can execute 7z, for example, from your SAS session.
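
To find out which situation applies, you can query the XCMD system option before trying any shell commands. This is a small sketch; the macro name `check_xcmd` is made up for illustration.

```sas
/* GETOPTION returns XCMD when shell commands are allowed, NOXCMD otherwise */
%macro check_xcmd;
  %if %sysfunc(getoption(xcmd)) = NOXCMD %then
    %put NOTE: XCMD is disabled, so X commands and pipes will not run in this session.;
  %else
    %put NOTE: XCMD is enabled, so external tools such as 7z can be called.;
%mend check_xcmd;

%check_xcmd
```

XCMD can only be set when the SAS session starts, so if it reports NOXCMD you would need to ask your SAS administrator or decompress the files outside of SAS.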
