LzEr23
Obsidian | Level 7

Hello.

I am new to SAS programming and need help unzipping .gz and .zip files using SAS.

 

I have a .zip file which contains several .csv.gz files within it.

I could find how to unzip .zip files and how to open .gz files, but doing both at once seems quite complicated.

What I need is to create unzipped .csv files. (Reading them directly as datasets would be helpful too.)

1 ACCEPTED SOLUTION
ChrisHemedinger
Community Manager

SAS 9.4 has a native capability to read ZIP files using FILENAME ZIP.  And if you have SAS 9.4 Maint 5, the method also supports GZIP files.

 

Because you have a ZIP of GZ files, you're looking at a two-step process.  First, expand the members of the ZIP file with FILENAME ZIP.  I have some examples in this blog post.  Note that the process involves "reading" each ZIP member and writing it as a new file to a temporary space -- effectively copying it out of the ZIP archive.  At the end of this step, you would have a collection of *.csv.gz files in your temp space.
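A minimal sketch of that first step, assuming SAS 9.4 on Windows. The archive path c:\data\archive.zip, the output folder c:\temp, and the member name file1.csv.gz are placeholders, not names from this thread:

```sas
/* Assumed location of the outer archive (placeholder path) */
filename inzip zip "c:\data\archive.zip";

/* List the members of the ZIP file */
data work.members(keep=memname);
  length memname $200;
  fid = dopen("inzip");
  if fid = 0 then stop;
  do i = 1 to dnum(fid);
    memname = dread(fid, i);
    output;
  end;
  rc = dclose(fid);
run;

/* Copy one member out of the archive, byte for byte */
/* (member name and output path are placeholders)    */
filename onegz zip "c:\data\archive.zip" member="file1.csv.gz";
filename gzout "c:\temp\file1.csv.gz";

data _null_;
  infile onegz lrecl=256 recfm=F length=length eof=eof unbuf;
  file gzout lrecl=256 recfm=N;
  input;
  put _infile_ $varying256. length;
  return;
eof:
  stop;
run;
```

To copy every member, the second DATA step could be wrapped in a macro and driven from work.members.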

 

You would then use FILENAME ZIP GZIP to reference each of those csv.gz files in turn.  You don't need to explicitly decompress those -- once you assign that fileref, you should be able to process the file with a DATA step.  See this blog post about reading GZIP files.
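As a sketch of that second step (it needs SAS 9.4 Maint 5 or later), here is one copied file being read directly. The path and the column layout (id, name, amount, with a header row) are hypothetical:

```sas
/* The GZIP option tells the ZIP access method to treat the file as gzip */
filename gzfile zip "c:\temp\file1.csv.gz" gzip;

data work.file1;
  /* Skip the assumed header row; columns are illustrative only */
  infile gzfile dsd dlm=',' firstobs=2 truncover;
  input id name :$50. amount;
run;
```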


5 REPLIES
RW9
Diamond | Level 26

Why do you have two levels of compression in the first place? It sounds like files gzipped on Linux were moved over to Windows, where someone else then zipped them, which is a bit daft.

Anyway, most zip programs (WinZip, 7-Zip) should be able to deal with both file types.  So you would need to:

1) Get the list of .zip filenames

2) Unzip each one using a command line extract

3) Get the list of extracted .gz files

4) Unzip each of those using a command line extract

 

A shell of a program might look like:

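/* Windows: don't wait for "exit" in the command window, but wait for each command to finish */
options noxwait xsync;

/* Steps 1-2: list the .zip files (bare names) and extract each one */
filename tmp pipe 'dir "c:\data\*.zip" /b';

data _null_;
  infile tmp truncover;
  length nm $200;
  input nm $200.;  /* read the whole line so names containing spaces survive */
  /* 7z.exe location is a placeholder; "e" extracts, -o sets the output folder, -y answers yes to prompts */
  call execute(cats('x ''c:\programfiles\7zip\7z.exe e "c:\data\', nm, '" -o"c:\data" -y'';'));
run;

/* Steps 3-4: list the extracted .gz files and extract each one the same way */
filename tmp pipe 'dir "c:\data\*.gz" /b';

data _null_;
  infile tmp truncover;
  length nm $200;
  input nm $200.;
  call execute(cats('x ''c:\programfiles\7zip\7z.exe e "c:\data\', nm, '" -o"c:\data" -y'';'));
run;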

The question is, is it worth coding this?  You can select all the files, right click and select extract to /* with 7zip for instance; it's not really a huge effort.

LzEr23
Obsidian | Level 7
Thank you so much for the help.

I can see that I first need to go through an unzipping process with FILENAME ZIP, and once I end up with a bunch of csv.gz temp files, I should move on to dealing with the gzip files.

But unfortunately, the only SAS version available to me is SAS 9.4 Maint 3, and apparently the GZIP option does not work on this version.

This has already been a great help, but can I ask for more advice on handling csv.gz files when I'm not using SAS 9.4 Maint 5?
RW9
Diamond | Level 26

Then your options are to a) use X commands to shell the commands out to the operating system, as I presented above, or b) do it outside of SAS.  There is no reason why you cannot do it via a normal batch file:

https://stackoverflow.com/questions/17077964/windows-batch-script-to-unzip-files-in-a-directory

 

Not all processing needs to be done in SAS.

error_prone
Barite | Level 11
Bad news: if XCMD is disabled in your SAS session, you can't unzip .gz files during SAS execution. With XCMD enabled you can execute 7z, for example, from your SAS session.
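One quick way to see which case applies is to print the option to the log. A small sketch using the standard GETOPTION function:

```sas
/* Writes the current setting of the XCMD system option to the log */
%put NOTE: This session is running with %sysfunc(getoption(xcmd));
```

The log shows XCMD when shell commands are allowed and NOXCMD when they are not.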
