Hello,
I received a request to review some reporting that my predecessor left behind years ago, and I'm not at all familiar with how to manage this. Within the "datamart" library, there is an item named "filename.sas7bdat.gz". As I understand it, this is a gzipped file containing some of the information I need to view in order to (hopefully) retrace the steps used to build this report. My predecessor also left the following code, which I've run, but it does not open any tables or seem to do anything at all. Help!
/*Unzip SAS7BDAT File*/
x "gunzip &datamart/filename.sas7bdat";
run;
A gz file is a compressed single file, usually on a Unix/Linux system. In this case you have a SAS data set file that has been compressed with gzip.
gunzip is a tool that can uncompress it. You can also accomplish this directly in SAS code with the FILENAME ZIP method and GZIP option. See this article for background and examples.
Example:
/* This expands the GZ data to a WORK data set */
filename zipdata ZIP "&datamart/filename.sas7bdat.gz" GZIP;
filename unzip "%sysfunc(getoption(WORK))/filename.sas7bdat";
data _null_;
  infile zipdata
    lrecl=256 recfm=F length=length eof=eof unbuf;
  file unzip lrecl=256 recfm=N;
  input;
  put _infile_ $varying256. length;
  return;
eof:
  stop;
run;
proc print data=work.filename(obs=5);
run;
Thanks for the response. Here's what I did, where "Import" is my library that corresponds to "drivename". However, at the proc print step, I am getting the error message 'IMPORT.filename.data is shorter than expected use PROC DATASETS; REPAIR to fix it'. What should I do to fix this?
filename zipdata ZIP "drivename/filename.sas7bdat.gz" GZIP;
filename unzip "drivename/filename.sas7bdat";
data _null_;
infile zipdata
lrecl=256 recfm=F length=length eof=eof unbuf;
file unzip lrecl=256 recfm=N;
input;
put _infile_ $varying256. length;
return;
eof:
stop;
run;
proc print data=import.filename(obs=5);
run;
Try actually using gunzip to expand the file and see if that works. If it does not work then the original gz file is corrupted.
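One way to run that check without leaving SAS is sketched below. It assumes the session is allowed to run operating system commands (XCMD enabled) and that gzip is available on the server; the fileref name gzcheck is made up for illustration, and "drivename" is the placeholder path from the post above. The -t option asks gzip to test the archive without writing anything, so it should not alter the original file.

```sas
/* Sketch only: assumes XCMD is allowed and gzip is on the server PATH. */
/* "drivename" is the placeholder path used earlier in this thread.     */
filename gzcheck pipe 'gzip -tv drivename/filename.sas7bdat.gz 2>&1';

data _null_;
  infile gzcheck;
  input;
  put _infile_;   /* echo whatever gzip reports to the SAS log */
run;

filename gzcheck clear;
```

If gzip reports a problem here, the archive itself is likely damaged and no amount of repair on the expanded data set will help.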
As it says, use Proc datasets with Repair:
libname mylib "c:\testdata";
proc datasets lib=mylib;
repair myfile;
run;
quit;
In most cases, running PROC CONTENTS on the library will also do the trick, but PROC DATASETS is the safer way.
@ChrisHemedinger Are all the options you added to data _null_ statement (e.g. lrecl, recfm, etc.) for good measure to cover all cases?
For example when I read the gz file using the vanilla code below I get truncation issues. But when I use your code with all the options everything seems to load nicely. Is it best practice to always use the options you provided?
Thanks
data _null_;
  infile fromzip;
  file target;
  input;
  put _infile_;
run;
If you don't tell SAS to use RECFM=F or N then it will default to treating the file as lines of text.
And a SAS dataset is NOT lines of text.
@spirto wrote:
Thanks Tom. And I am guessing the record length of 256 assumes 'worst case' scenario and the code later sets the actual length?
Actually, the variable named in the LENGTH= option is set to the actual number of bytes read. So the first N-1 blocks read will be exactly 256 bytes each. It is only the last block that might not be the full 256 bytes.
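To see this behavior for yourself, a rough sketch like the one below reads the same ZIP fileref in 256-byte blocks and only counts what it reads, without writing any output file. It assumes the zipdata fileref from the earlier example is still assigned; the variable names len, nblocks, nbytes and lastlen are made up for illustration.

```sas
/* Sketch only: assumes the fileref zipdata from the earlier example. */
data _null_;
  retain lastlen;              /* keep the last block length across iterations */
  infile zipdata lrecl=256 recfm=F length=len eof=done unbuf;
  input;
  nblocks + 1;                 /* number of blocks read so far */
  nbytes + len;                /* running total of bytes actually read */
  lastlen = len;
  return;
done:
  /* reached when INPUT hits end of file */
  put nblocks= nbytes= lastlen=;
  stop;
run;
```

If the description above holds, lastlen should be no more than 256, and nbytes should match the size of the expanded file written by the copy step.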