supersasnewbie
Calcite | Level 5

Hello,

I received a request to review some reporting that my predecessor left behind years ago, and I'm not familiar at all with how to manage this. Within the "datamart" library, there is an item named "filename.sas7bdat.gz". To my understanding, this is a gzipped file which would contain some of the information that I need to view in order to (hopefully) retrace the steps used to build this report. My predecessor also left some code, as follows, which I've run but does not actually open any tables, or seem to do anything at all. Help!

 

/*Unzip SAS7BDAT File*/
x "gunzip &datamart/filename.sas7bdat";

run;

Accepted Solution
ChrisHemedinger
Community Manager

A gz file is a compressed single file, usually on a Unix/Linux system. In this case you have a SAS data set file that has been compressed with gzip.

 

gunzip is a tool that can uncompress it. You can also accomplish this directly in SAS code with the FILENAME ZIP method and the GZIP option. See this article for background and examples.

 

Example:

/* This expands the GZ data to a WORK data set */
filename zipdata ZIP "&datamart/filename.sas7bdat.gz" GZIP;
filename unzip "%sysfunc(getoption(WORK))/filename.sas7bdat";
 
data _null_;
   infile zipdata
       lrecl=256 recfm=F length=length eof=eof unbuf;
   file unzip lrecl=256 recfm=N;
   input;
   put _infile_ $varying256. length;
   return;
 eof:
   stop;
run;
 
proc print data=work.filename(obs=5);
run;

 


supersasnewbie
Calcite | Level 5

Thanks for the response. Here's what I did, where "Import" is my library that corresponds to "drivename". However, at the PROC PRINT step I get the error message "IMPORT.filename.data is shorter than expected use PROC DATASETS; REPAIR to fix it". What should I do to fix this?

 

filename zipdata ZIP "drivename/filename.sas7bdat.gz" GZIP;
filename unzip "drivename/filename.sas7bdat";
 
data _null_;
   infile zipdata
       lrecl=256 recfm=F length=length eof=eof unbuf;
   file unzip lrecl=256 recfm=N;
   input;
   put _infile_ $varying256. length;
   return;
 eof:
   stop;
run;
 
proc print data=import.filename(obs=5);
run;

 

Tom
Super User

Try actually using gunzip to expand the file and see if that works. If it does not, then the original gz file is corrupted.
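
One way to follow this suggestion without leaving SAS is to run gzip's built-in integrity test through a pipe. This is only a sketch: it assumes a Unix/Linux host with the XCMD option enabled, and it reuses the `drivename/filename.sas7bdat.gz` path from the post above as a placeholder. The fileref `gztest` and the variable `line` are made up for the example.

```sas
/* Assumption: SAS runs on Unix/Linux and is allowed to issue host commands. */
/* This prints XCMD or NOXCMD; with NOXCMD the pipe below will not run.     */
%put NOTE: XCMD setting is %sysfunc(getoption(XCMD));

/* gzip -t tests the archive without writing any output file; -v names the  */
/* file and its status. 2>&1 sends gzip's messages into the pipe so SAS     */
/* can read them. The path is a placeholder taken from the post above.      */
filename gztest pipe "gzip -t -v drivename/filename.sas7bdat.gz 2>&1";

data _null_;
   infile gztest truncover;
   input line $char256.;
   put line;            /* echo each message line to the SAS log */
run;

filename gztest clear;
```

If gzip reports a problem here, such as an unexpected end of file, the archive itself is most likely damaged and changing the SAS-side options will not bring the missing bytes back.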

 

ErikLund_Jensen
Rhodochrosite | Level 12

Hi @supersasnewbie 

 

As the message says, use PROC DATASETS with REPAIR:

 

libname mylib "c:\testdata";

proc datasets lib=mylib;
   repair myfile;
run;
quit;

In most cases, running PROC CONTENTS on the library will also do the trick, but PROC DATASETS is the more reliable way.
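
For reference, the PROC CONTENTS alternative mentioned above might look like the sketch below. It reuses the `mylib` libref and the `c:\testdata` path from the example, both of which are placeholders for your own library.

```sas
/* Placeholder library, same as in the PROC DATASETS example above. */
libname mylib "c:\testdata";

/* _ALL_ with NODS lists the members of the library without printing */
/* the full descriptor of every data set.                            */
proc contents data=mylib._all_ nods;
run;
```

Keep in mind that a repair can only work with the bytes that are actually in the file, so it is probably not a fix for a data set that was cut short while being expanded.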

spirto
Obsidian | Level 7

@ChrisHemedinger  Are all the options you added to data _null_ statement (e.g. lrecl, recfm, etc.) for good measure to cover all cases?

 

For example when I read the gz file using the vanilla code below I get truncation issues. But when I use your code with all the options everything seems to load nicely. Is it best practice to always use the options you provided?

 

Thanks

 

data _null_;   
   infile fromzip;
   file target ;
   input;
   put _infile_ ;
run;

 

Tom
Super User

If you don't tell SAS to use RECFM=F or N then it will default to treating the file as lines of text.

And a SAS dataset is NOT lines of text.

spirto
Obsidian | Level 7
Thanks Tom. And I am guessing the record length of 256 assumes 'worst case' scenario and the code later sets the actual length?
Tom
Super User

@spirto wrote:
Thanks Tom. And I am guessing the record length of 256 assumes 'worst case' scenario and the code later sets the actual length?

Actually, the variable defined by the LENGTH= option is set to the actual number of bytes read. So the first N-1 blocks read will be exactly 256 bytes. It is just the last block that might not be the full 256 bytes.
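
This can be checked with a small test file. The sketch below is an illustration and not part of the original answer: it writes 600 bytes to a scratch file in WORK and reads them back with the same INFILE options as the accepted solution. The fileref `demo`, the file name `demo.bin`, and the variables `i` and `len` are made up for the example. Since 600 = 2 x 256 + 88, the log should report two blocks of 256 bytes and a final block of 88 bytes if the behavior is as described.

```sas
/* Hypothetical scratch file in the WORK directory. */
filename demo "%sysfunc(getoption(WORK))/demo.bin";

/* Write 600 single bytes as a raw stream (RECFM=N adds no line breaks). */
data _null_;
   file demo recfm=N;
   do i = 1 to 600;
      put 'A';
   end;
run;

/* Read the file back in fixed 256-byte blocks. LENGTH= names a variable */
/* (here len) that holds the number of bytes in the block just read.     */
data _null_;
   infile demo lrecl=256 recfm=F length=len eof=done unbuf;
   input;
   put 'Block ' _n_ 'bytes read: ' len;
   return;
 done:
   stop;
run;

filename demo clear;
```

This is also why the accepted solution writes each block with `$varying256.` and the length variable: the last block is written with only the bytes that were actually read, instead of being padded out to 256.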
