You may be able to uncompress, on the fly, by piping the file in your filename statement. e.g., take a look at: http://www.ats.ucla.edu/stat/sas/faq/readgz.htm
We used the same piping syntax in the FILENAME statement for five years, through thousands of programs and two changes of operating system. It stopped working after we changed to Windows 7.
It seems like an installation issue with your gzip under windows 7 then...
http://gnuwin32.sourceforge.net/packages/gzip.htm
You may also try using a different utility such as 7z, which can also work with gzip files through command line and with pipes.
Hm, the data comes as GZIP. It would be very cumbersome to convert all the files for a different utility and to ask for the input in a different format.
Not a different format. Just a different utility program on your PC to read the same format.
You haven't told us your complete environment, so these comments may not be applicable.
On Win XP (at least), piping the GZIP command produces a temporary (hidden) copy of the uncompressed data, which takes up a lot of space and disk I/O (this does not happen on *nix systems).
Windows compression gives nearly as good a compression on raw data files as GZIP.
The two of these together means that using Windows compression (and skipping the pipe) is more efficient in Windows.
------------
Caveat: I discovered the first one the hard way several years ago and can't currently remember which compression utility I was using. However, my gut feeling is that it is a Windows "feature" rather than part of gzip.
Message was edited by: Lawrence Muhlbaier
Thanks, Doc@Duke: I may try Windows compression directly as a separate test project, but many of my input files reside on a non-Windows server. I have to copy them to my local drive.
SAS Tech support claims it is not SAS's fault. It must be the environment or the GZIP issue.
Here is the basic code.
%let File1="e:\John\ProductFiles\Product.out.20120115.gz";
%let dir=C:\"Program Files"\GnuWin32\bin\gzip;
data Product (keep=var1 var2 var3 var4);
filename file2 pipe %unquote(%str(%'&dir -cd &file1%'));
infile file2 DSD missover;
length var1 $12 var2 $14 var3 $3 var4 $20;
input var1 var2 var3 var4;
run;
Try setting up your command prompt environment first by doing the following:
x set path=%nrstr(C:\Program Files\GnuWin32\bin;%PATH%);
*you may want to set the following to an alternate path as they are where the temporary files are written;
*by default this location is in your home directory, I believe;
x set tmp=%TEMP%;
x set tmpdir=%TEMP%;
%let file1=E:\John\ProductFiles\Product.out.20120115.gz;
filename file2 pipe "gzip -cd &file1";
If you still experience issues in SAS, try running the same commands in a Windows command prompt and see whether you get similar issues or an error message of some kind.
I doubt it is the issue, but you should move your FILENAME statement to BEFORE the data step that is trying to use it. It is a global statement and not one that executes inside the data step.
You can simplify your FILENAME statement by using the QUOTE function rather than the macro quoting functions.
QUOTE will surround the string in dquote (") characters and double up any dquote (") characters in your generated command so that they are passed properly to the operating system.
filename file2 pipe %sysfunc(quote(&dir -cd &file1));
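Putting the two suggestions above together (FILENAME placed before the DATA step, and QUOTE used in place of the macro quoting functions), a minimal sketch might look like the following. It reuses the macro variable values and variable names from the code posted earlier in the thread; the %PUT line is only an added debugging aid and is not part of the original program.

```sas
/* Values copied from the code posted earlier in this thread. */
%let File1="e:\John\ProductFiles\Product.out.20120115.gz";
%let dir=C:\"Program Files"\GnuWin32\bin\gzip;

/* Added for debugging: write the generated command string to the log  */
/* so the quoting can be inspected before the pipe is opened.          */
%put %sysfunc(quote(&dir -cd &File1));

/* FILENAME is a global statement, so it is placed before the DATA step. */
/* QUOTE wraps the command in double quotes and doubles any embedded     */
/* double quotes so they survive as literal characters.                  */
filename file2 pipe %sysfunc(quote(&dir -cd &File1));

data Product (keep=var1 var2 var3 var4);
  infile file2 dsd missover;
  length var1 $12 var2 $14 var3 $3 var4 $20;
  input var1 var2 var3 var4;
run;
```

If the log shows a command that does not run when pasted into a command window, the problem lies in the command or the gzip installation rather than in the DATA step.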
Try moving the quotes in your DIR (actually command) macro variable to the front and back instead of just around the Program Files part of the path.
Open a command window on your PC and run the same command to see what it does. Try piping it to more so you can scroll through the file.
> "c:\Program Files\GnuWin32\bin\gzip" -cd "e:\John\ProductFiles\Product.out.20120115.gz" | more
Thank you for all your input. I really appreciate all of your answers. Since the problem did not occur on one of the SAS PCs at my location, I decided to start from scratch by loading GZIP (thanks art297 and FriedEgg!) on my Win 7 machine, which has the same SAS installation but no prior history of GZIP. Much to my surprise, I succeeded in reading a comma-delimited file compressed by GZIP with no problems. So we will have to check both the SAS and GZIP installs on the other PC.
Comments on some hints from Tom:
1) Using double quotes around the whole &dir path and keeping the embedded space in Program Files yielded an error.
NOTE: The infile FILE2 is:
Unnamed Pipe Access Device,
PROCESS="C:\Program Files\GnuWin32\bin\gzip" -cd "c:\curr_avail.txt.gz",
RECFM=V,LRECL=256
Stderr output:
'C:\Program' is not recognized as an internal or external command,
operable program or batch file.
2) Moving the filename statement to above the data step did not make any difference, though I agree it is a global statement and should be moved up.
3) Both %unquote and %sysfunc yielded correct results.
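A likely explanation for the error in 1), offered as an assumption rather than something confirmed in this thread: the pipe command is handed to cmd.exe, and when a command line starts with a double quote and contains more than two quote characters, cmd.exe strips the first and the last quote. What remains starts with the unquoted text C:\Program, which matches the Stderr message. A workaround often used for this is to wrap the entire command in one extra pair of double quotes, sketched below with the paths taken from the log above.

```sas
/* Sketch only: the outer pair of double quotes is the one cmd.exe      */
/* strips, leaving the inner pairs around the program path and the      */
/* file name intact. Single quotes delimit the SAS string itself.       */
filename file2 pipe
  '""C:\Program Files\GnuWin32\bin\gzip" -cd "c:\curr_avail.txt.gz""';

/* Quick check: echo the first few decompressed lines to the log. */
data _null_;
  infile file2 obs=5;
  input;
  put _infile_;
run;
```

The form used in the original code, with quotes only around "Program Files", sidesteps the issue because the command does not begin with a quote character.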
Since I cannot replicate the general failure, we'll have to test many scenarios, as FriedEgg, Doc@Duke, and Tom suggested. Unfortunately the PC in question is at a site remote from me, so it may take longer than my own fixes.
My apologies! It turned out the PC needed a reboot after the installation of GZIP (duh!). I did reboot mine, but the other user did not, since the GZIP installation did not prompt for it.
Well, at least we now know that it works.
Now I have a follow-up question: is it possible to read, in SAS, a GZIP-compressed FOLDER containing several comma-delimited text files? If so, what would be the syntax?
Actually, when I decompress one of my standard GZ files I get two folders: one with just one file, and another one with multiple files.
GZIP files cannot contain folder structures. This must be a different type of archived file.
The filename syntax is filename.tar.gz, so I assume it is GZIP. I gather the file inside is the result of using tar to bundle several files into one archive while preserving the folder structure. When I decompress it, I get two folders.
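One possible approach for a .tar.gz file, sketched under several assumptions: a tar executable is installed and on the PATH next to gzip (it is not part of the gzip package), the archive path and the member name folder1/member.csv are hypothetical placeholders, and the member has the same four-column layout as the files read earlier in this thread. The idea is to let gzip decompress to standard output and have tar either list the members or write a single member to standard output.

```sas
/* Hypothetical archive path; replace with the real .tar.gz file. */
%let archive=E:\John\ProductFiles\archive.tar.gz;

/* Step 1: list the members of the archive (tar -t lists, -f - reads stdin). */
filename tlist pipe "gzip -cd &archive | tar -tf -";

data members;
  infile tlist truncover;
  input member $256.;
run;

/* Step 2: stream one member to stdout (tar -x extracts, -O writes to stdout). */
/* folder1/member.csv is a hypothetical member name taken from the listing.   */
filename tmem pipe "gzip -cd &archive | tar -xOf - folder1/member.csv";

data one_file (keep=var1 var2 var3 var4);
  infile tmem dsd missover;
  length var1 $12 var2 $14 var3 $3 var4 $20;
  input var1 var2 var3 var4;
run;
```

Reading every member would mean repeating step 2 once per name in the members data set, for example from a macro loop. The 7z utility mentioned earlier in the thread can also open tar archives, so it may be an alternative if tar is not available.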