Hello
I want to read a zipped file, but the dataset is too big (about 400 variables) and I only need about 20 of them. Is it possible to apply the KEEP= option while extracting the ZIP file?
Here is the code I use:
filename inzip ZIP "D:\MesDonnee\BD_Prog_SAS\medecv2.zip";
filename laboHier "%sysfunc(getoption(work))/medecv2.sas7bdat" ;
data _NULL_;
infile inzip(medecv2.sas7bdat) lrecl=256 recfm=F length=length eof=eof unbuffered ;
file laboHier lrecl=256 recfm=N;
* write each block read from the ZIP member to medecv2.sas7bdat in WORK;
input;
put _infile_ $varying256. length;
return;
eof:
stop;
run;
No.
You have a ZIP file that contains a SAS dataset (.sas7bdat).
For SAS to read it as a dataset, it has to exist as an actual file on disk.
So you first need to extract the member from the ZIP file into a physical dataset file. If your WORK disk does not have enough room, extract it to a folder on some other disk instead.
You can then copy the variables you want into another SAS dataset and delete the extracted file.
If you extract the file into the WORK library, a DATA step that recreates the dataset under the same name will replace the version copied from the ZIP file.
Just list the variables you need in the KEEP= option.
filename inzip ZIP "D:\MesDonnee\BD_Prog_SAS\medecv2.zip"
member="medecv2.sas7bdat"
recfm=f lrecl=256
;
filename laboHier "%sysfunc(getoption(work))/medecv2.sas7bdat"
recfm=f lrecl=256
;
%put Return code = %sysfunc(fcopy(inzip,labohier));
data medecv2;
set medecv2(keep=var1 var2);
run;
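If WORK is too small, the same steps can point at another disk. The following is only a sketch: the folder E:\temp_extract, the libref BIG, and the output name WANT are hypothetical, and var1 var2 stand in for the real variable names.

```sas
/* Hypothetical folder on a disk with enough free space */
%let outdir = E:\temp_extract;

filename inzip ZIP "D:\MesDonnee\BD_Prog_SAS\medecv2.zip"
  member="medecv2.sas7bdat"
  recfm=f lrecl=256
;
filename outds "&outdir\medecv2.sas7bdat"
  recfm=f lrecl=256
;

/* FCOPY returns 0 on success; SYSMSG() gives the text of any error */
%let rc = %sysfunc(fcopy(inzip, outds));
%put Return code = &rc %sysfunc(sysmsg());

/* Read the extracted file through a libref and keep only what is needed */
libname big "&outdir";

data work.want;
  set big.medecv2(keep=var1 var2);
run;

/* Remove the large extracted copy once the subset exists */
proc datasets library=big nolist;
  delete medecv2;
quit;

libname big clear;
filename inzip clear;
filename outds clear;
```

Checking the return code before the DATA step avoids reading a partially copied file.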
Thank you for your answer. What prompted the question is that extracting all the variables takes a lot of time, and I only need about 20.
The issue is that it takes a long time to do anything with a file that large. If the data in the ZIP file were a text file, such as a CSV file, you could write a DATA step that creates only the variables you need, because SAS does not need to create a physical file first in order to read a text file. But the step would still have to READ the whole file. There is no way to read the 10th line without reading the 9 lines before it, or to read the 10th value on a line without reading past the 9 values before it.
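To illustrate the text-file case only: the sketch below assumes the archive held a member named medecv2.csv (it does not in this thread) with a header row and three comma-separated columns, of which the first and third are the numeric variables wanted. The names incsv, skip, var1 and var2 are placeholders.

```sas
/* Hypothetical CSV member; the real ZIP contains a .sas7bdat instead */
filename incsv ZIP "D:\MesDonnee\BD_Prog_SAS\medecv2.zip" member="medecv2.csv";

data want;
  infile incsv dsd dlm=',' firstobs=2 truncover;
  length skip $1;         /* throwaway variable for the unwanted column */
  input var1 skip var2;   /* fields are still read in file order */
  drop skip;              /* only var1 and var2 are written to WANT */
run;
```

Every line is still read from the archive; only the output dataset is smaller.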
Not during the zip extracting process - you need to expand the entire SAS data set file. Use a separate DATA step to create just the subset you need after extracting.
filename inzip ZIP "D:\MesDonnee\BD_Prog_SAS\medecv2.zip";
filename laboHier "%sysfunc(getoption(work))/medecv2.sas7bdat" ;
data _NULL_;
infile inzip(medecv2.sas7bdat) lrecl=256 recfm=F length=length eof=eof unbuffered ;
file laboHier lrecl=256 recfm=N;
* write each block read from the ZIP member to medecv2.sas7bdat in WORK;
input;
put _infile_ $varying256. length;
return;
eof:
stop;
run;
data want;
set medecv2(keep=var1-var20);
run;
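The names var1-var20 above are placeholders. One way to see the real variable names before writing the KEEP= list is to query the extracted dataset's metadata, sketched here assuming the file was extracted into WORK as in the step above (the output name varlist is hypothetical):

```sas
/* List variable names and attributes without reading the data rows */
proc contents data=work.medecv2
              out=work.varlist(keep=name type length varnum)
              noprint;
run;

/* Show the variables in their stored order */
proc sort data=work.varlist;
  by varnum;
run;

proc print data=work.varlist noobs;
run;
```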
Thank you for your answer.