Hi everyone,
I have hundreds of files that need to be unzipped. There are two major problems:
1. Input: right now a bare 'input;' does not work. It only works when I specify the variables with their lengths and formats, which causes a lot of trouble because we have to redefine all the variables and formats. When dealing with different tables, it's a disaster.
I am hoping for a simple command that handles the input universally.
2. Is there any way to unzip a batch of files? I have hundreds, arriving batch by batch, and I don't want to do them one by one.
Thanks.
So you don't have ZIP files at all? Just GZIP files? If so, that makes it much easier to read all of the files in one data step. First get a list of the files into a dataset, for example by reading the output of the dir command (or ls or find on Unix).
data files;
  infile 'dir C:\Logs\SEGuide_log.*.txt.gz /b' pipe truncover;
  input filename $256. ;
run;
Now use that list to drive the data step that reads the files.
data logdata;
  set files;
  fname=filename ;
  infile logfile zip gzip filevar=fname end=eof;
  do while (not eof);
    input date : yymmdd10. timestamp : anydttme. ;
    output;
  end;
  format date date9. timestamp timeampm.;
run;
filename fromzip ZIP "C:\Logs\SEGuide_log.10168.txt.gz" GZIP;
data logdata;
  infile fromzip; /* read directly from compressed file */
  input date : yymmdd10. time : anydttme. ;
  format date date9. time timeampm.;
run;
Hi Reeza, thanks for your advice.
1. The above code is similar to what I used. I will try PROC IMPORT.
2. I thought about it; if there is no better choice, I guess a macro is the only way.
Thanks for your advice, Tom. Yes, they are not ZIP files but .gz files.
The first step does indeed generate the list of files I want.
The problem is the second step.
1. All the data is stored in the 'DATA' folder, to which I have read access only, not write access.
2. Whether I create a folder and point a libname at it or just use the SAS WORK library, the error message is always:
ERROR: Open failure for C:\WINDOWS\SYSTEM32\filename during attempt to create a local file handle.
So I am not sure whether this is a path problem (the file name is not linked to the original 'DATA' folder) or an unzip problem. Thanks.
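The folder in the error message is not the data folder, which hints that SAS may be resolving a name without a path against its current working directory. One way to test that guess is a minimal sketch like the one below. It assumes the FILES dataset from the first step, with one name per row in FILENAME; the dataset name CHECK and the variable FOUND are made up for illustration.

```sas
/* Assumes the FILES dataset from the first step exists, */
/* with one file name per row in the variable FILENAME.  */
data check;
  set files;
  /* FILEEXIST returns 1 if the physical file can be found, 0 otherwise */
  found = fileexist(strip(filename));
  /* Write each name and the result to the SAS log */
  put filename= found=;
run;
```

If FOUND is 0 for every row, the names in the list probably lack the folder they live in, which would point to a path problem and not an unzip problem.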
Please show the actual code you ran, and the SAS log if possible.
The only reason that data step would even attempt to write somewhere would be if you messed up the DATA statement. Just use the exact code I used:
data want;
and it will create a dataset named WORK.WANT.
Sorry, this is in a Safe Environment, so I cannot copy anything out.
Whether I use data want; or data libname.want;
the error message is the same as the one in my last reply.
So do it step by step.
1) Make sure you can create a dataset.
data test1;
  x=1;
run;
2) Make sure you can read one of the gzip files. Like your original example.
data test2;
  infile '.....txt.gz' zip gzip ;
  input ....;
run;
3) Now try to read it using FILEVAR= option.
data test3;
  fname='.....txt.gz';
  infile dummy zip gzip filevar=fname;
  input ....;
run;
4) Make a dataset with one filename and try reading it using that.
data test4;
  filename='.....txt.gz';
run;
data test5;
  set test4;
  fname=filename;
  infile dummy zip gzip filevar=fname end=eof;
  do while (not eof);
    input ....;
    output;
  end;
run;
Now replace TEST4 dataset with a larger list of filenames and then re-run the step to use it to create TEST5.
Thank you very much, Tom.
I also ran your step-by-step code. I found that in the step-by-step code the file name is the full name, including the path.
In the solution code, however, the first step lists all the file names but without the path to the files.
So in the second step I added the path:
fname = "&path\" || filename;
Now everything works perfectly.
That is one of the "features" of the Windows/DOS dir command: with the /b option it lists bare file names without the path.
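Putting the pieces of this thread together, the fix of prefixing the folder to each bare file name could be sketched as follows. This is only a sketch under assumptions: the macro variable PATH and its value C:\Logs are placeholders taken from the earlier example, the folder name is assumed to contain no spaces, and the INPUT statement is the one from the example log files, so it will need to match the real file layout.

```sas
/* Assumption: PATH is the folder holding the .gz files (placeholder value) */
%let path = C:\Logs;

/* Step 1: list the files. dir /b returns bare names without the folder. */
data files;
  infile "dir &path\SEGuide_log.*.txt.gz /b" pipe truncover;
  input filename $256. ;
run;

/* Step 2: rebuild the full name and let FILEVAR= switch between files */
data logdata;
  set files;
  length fname $512;
  /* Double quotes are needed so that &path resolves */
  fname = cats("&path\", filename);
  infile logfile zip gzip filevar=fname end=eof;
  do while (not eof);
    input date : yymmdd10. timestamp : anydttme. ;
    output;
  end;
  format date date9. timestamp timeampm.;
run;
```

Because the step only reads from the data folder and writes WORK.LOGDATA, read-only access to that folder should be enough. An alternative to prefixing the folder is to run dir with both /s and /b, which prints full paths but also descends into subfolders.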