AdamMadison
Fluorite | Level 6

Hello,

 

I have the names of files as observations in a dataset and I would like to loop through them to pipe the filenames into my next step.

For more context, I have the dataset shown in the image below (work.contents). I want to pipe the contents (just the name) of obs 1 ("NSDQsh20100802.txt") into:

 

 

filename xl "%sysfunc(getoption(work))/NSDQsh20100802.txt" ;
data _null_;
   /* using member syntax here */
   infile inzip(NSDQsh20100802.txt) 
       lrecl=256 recfm=F length=length eof=eof unbuf;
   file   xl lrecl=256 recfm=N;
   input;
   put _infile_ $varying256. length;
   return;
 eof:
   stop;
run;

 

where NSDQsh20100802.txt is the content of obs 1. After that is complete, do the same for obs 2, obs 3, ...

 

 

 

 

[Image: Files.PNG, the work.contents dataset]

 

 

ACCEPTED SOLUTION
Patrick
Opal | Level 21

I don't expect the following to be the final solution for you but it should give you the pointers you need.

The code below is based on Chris H's blog, which, judging by your post, you have already found.

I'm also attaching the sample zip file I used, so that the sample code below is fully runnable.

 

The code allows you to read a member in a zip file into SAS, "do stuff" with it, and then collect the result in a WANT dataset - and then repeat the process for the next member in the zip file.

/* list members in zip file */
%macro zipMemList(source=, outds=memlist);
  /* Assign a fileref with the ZIP method */
  filename inzip zip "&source";

  /* Read the "members" (files) from the ZIP file */
  data &outds(keep=zip memname);
   length zip $200 memname $200;
   zip="&source";
   fid=dopen("inzip");
   if fid=0 then
    stop;
   memcount=dnum(fid);
   do i=1 to memcount;
    memname=dread(fid,i);
    output;
   end;
   rc=dclose(fid);
  run;

  filename inzip clear;
%mend;
 
/* read member in zip file into SAS dataset */
%macro ReadMemInZip(source=, member=, outds=);
  /* Assign a fileref with the ZIP method */
  filename inzip zip "&source";

  /* Import a text file directly from the ZIP */
  data _tmp;
   infile inzip(&member) 
     firstobs=1 dsd dlm=',';
   input 
    (var1-var3) ($) var4;
  run;

  /* append to want dataset */
  proc datasets lib=work nolist;
    append base=&outds data=_tmp;
    run;
    delete _tmp;
    run;
  quit;

  filename inzip clear;
%mend;


/* create dataset with members in specific zip file */
%zipMemList(source=~/test/files.zip, outds=memlist);

/* create dataset with all members in all zip files in a specific folder 
   - macro code here:
    https://blogs.sas.com/content/sasdummy/2016/10/16/filename-zip-list-file-contents/
*/
/*%listzipcontents (targdir=~/test, outlist=memlist);*/


/* execute macro %ReadMemInZip() once per member in zip file */
data _null_;
  set memlist;
  cmd=cats('%ReadMemInZip(source=',zip,', member=',memname,', outds=want)');
  call execute(cmd);
run;

title 'data combined';
proc print data=want;
run;
title;

 


6 REPLIES
GGO
Obsidian | Level 7

Here's a clever example that could be a good guide for you:

 

http://pharma-sas.com/a-sas-macro-to-combine-portrait-and-landscape-rtf-files-into-one-single-file/

 

The author is looping through RTF files (see "filename rtffiles (&rtffiles);")

Then this snippet cycles through each file:

 

  do until (eof);
    infile rtffiles lrecl=32767 end=eof filevar=fileloc;
    input;
    rtfcode=_infile_;
    *--- implement your logic ;
  end;

It's a clever bit of code. Lots of techniques to learn in that posting.
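The FILEVAR= technique in the snippet above can also be driven by a dataset of file names. The following is only a minimal sketch under assumptions that are not in the original post: the text files have already been extracted into the WORK directory, the names sit in a variable MEMNAME in work.contents, every file is non-empty, and each line fits in 200 characters. The variable names fileloc and line and the placeholder fileref dummy are illustrative.

```sas
/* Sketch: read every file listed in WORK.CONTENTS in one data step.      */
/* Assumptions: files were already unzipped into the WORK directory,      */
/* MEMNAME holds the file name, and no file is empty.                     */
data all_lines;
  length fileloc $500 line $200;
  set contents;                              /* one file name per observation */
  fileloc = catx('/', "%sysfunc(pathname(work))", memname);
  /* FILEVAR= switches the physical file each time FILELOC changes;       */
  /* DUMMY is only a placeholder fileref and is never assigned.           */
  infile dummy filevar=fileloc end=eof truncover lrecl=32767;
  do while (not eof);
    input line $char200.;
    /* implement your logic here, e.g. keep only the rows you need */
    output;
  end;
run;
```

Because the inner loop reads one file to its end before the SET statement fetches the next name, the step makes a single pass over all listed files.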

Tom
Super User

Sounds like you are trying to expand the ZIP file?

Why not just run an operating system command to unzip the file?  Such as the unzip command.

data _null_;
  infile "cd %sysfunc(pathname(work));unzip %sysfunc(pathname(inzip))" pipe;
  input;
  put _infile_;
run;

 

Tom
Super User

What is it that you are trying to do? What do you actually have?

It looks like you have a dataset named CONTENTS with a variable named MEMNAME that has a list of filenames.

Do you want to use that to run that little data step?  

 

A simple pattern is to convert the code to a macro definition. Then call the macro once for each value in the CONTENTS dataset.

 

Is that what you want to do?  
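As a rough illustration of that pattern, the data step from the question could be wrapped in a macro and called once per MEMNAME value. This is a sketch, not a tested solution: the macro name extract_member is hypothetical, and it assumes the INZIP fileref has already been assigned with the ZIP method and that member names contain no spaces or special characters.

```sas
/* Hypothetical macro wrapping the data step from the question.        */
/* Assumes the INZIP fileref is already assigned with the ZIP method.  */
%macro extract_member(member=);
  filename xl "%sysfunc(getoption(work))/&member";

  data _null_;
    /* using member syntax here */
    infile inzip(&member)
        lrecl=256 recfm=F length=length eof=eof unbuf;
    file xl lrecl=256 recfm=N;
    input;
    put _infile_ $varying256. length;
    return;
   eof:
    stop;
  run;

  filename xl clear;
%mend extract_member;

/* Generate one macro call per observation in CONTENTS.                */
/* %NRSTR delays macro execution until after this data step finishes.  */
data _null_;
  set contents;
  call execute(cats('%nrstr(%extract_member(member=', memname, '))'));
run;
```

Wrapping the call in %NRSTR is a common precaution with CALL EXECUTE: without it the macro would start executing while the driving data step is still running, which matters once the macro contains logic that depends on the results of its own steps.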

AdamMadison
Fluorite | Level 6

Thank you all for your comments. 

 

I currently have a lot of zip files (~120), and each zip file contains about 22 txt files. What I ultimately want to do is import each of the txt files. They are very large, so my plan of attack is to loop through the following process:

1) Unzip the first txt file in the current zip (file 1)

2) Import file (file 1)

3) Process file (file 1) to shrink its size

.

.

1) Unzip the second txt file in the current zip (file 2)

2) Import file (file 2)

3) Process file (file 2) to shrink its size

.

.

1) Unzip the nth txt file in the current zip (file n)

2) Import file (file n)

3) Process file (file n) to shrink its size

.

.

Finally, append the datasets (1, 2, ..., n).

Then move on to the next zip file and repeat the same routine.

 

I hope this clears up my intentions.

 

What I have done so far is create a dataset (CONTENTS) that collects the names of the txt files in the current zip file. I think if I can loop through the filenames, I can run the routine above.

 

Tom
Super User

Are the files all supposed to be in the same format?  If so you can read them all at once. You really don't want to use PROC IMPORT for text files. Especially if you are reading a lot of files in the same format.  It will have to GUESS how to define the variables and is likely to define the dataset differently for every file.

 

How are you making them smaller?  Can you do it in the same step that is reading the text?

data want;
  infile '/path to file/filename.zip' zip member='*' truncover ;
  input ... ;
  if (...) then delete;
run;
