BookmarkSubscribeRSS Feed

How do I write a macro to...import multiple text files that have the same format?

Started ‎02-13-2015 by
Modified ‎02-11-2019 by
Views 25,351

This is part two of an ongoing series into how to accomplish a task WITHOUT using a macro.

In the previous post, I split a file without using a macro. In this post, I go over how

read in text files that have the same format and create a single file.

 

One method of doing so is:

  1. Use the OS to generate a list of file names
  2. Write the code to read a single file
  3. Convert that to a macro
  4. Read each file in using the macro and append the results

 

Instead of this method, SAS provides a wildcard method of referring to files.

 

This means if my file reference looks like:

 

 

"/folders/myfolders/*.csv"

 

 

SAS will read all CSV folders in the folder myfolders.

 

You can limit your wild card as well using a more specific naming convention

For example, the follow line of code will import all files that start with the word cars:

 

 

"/folders/myfolders/cars*.csv"

 

 

 

However there are some things to consider:

  • Headers in each file
  • How do I know which record a file comes from

 

Fortunately SAS also provides options for those, EOV allows us to identify the first

record in a file and FILENAME allows us to identify the file name of the file imported.

 

Enough talking, here's what the full code would look like. Hope you find it useful!

 

 

 

data import_all;
 
*make sure variables to store file name are long enough;
length filename txt_file_name $256;
 
*keep file name from record to record;
retain txt_file_name;
 
*Use wildcard in input;
infile "Path\*.txt" eov=eov filename=filename truncover;
 
*Input first record and hold line;
input@;
 
*Check if this is the first record or the first record in a new file;
*If it is, replace the filename with the new file name and move to next line;
if _n_ eq 1 or eov then do;
txt_file_name = scan(filename, -1, "\");
eov=0;
delete; end;   *Otherwise  go to the import step and read the files; else input   *Place input code here;   ; run;

 

 

This code is also available on GitHub

Comments

Nice material . Macro programming is often used when it is not really needed. Code writing code is harder to read and to maintain.

In your example,why txt_file_name = scan(filename, -2, ".\") ? File names can contain zero, one or more than one period characters. Wouldn't txt_file_name = scan(filename, -1, "\") be simpler and still do the job?

PG

Thanks Pierre.  You are correct and I'll modify the code to reflect this.

In general, I avoid periods in my file names but I guess others don't follow that rule Smiley Happy

Hi,

 

could you please specify what you mean by 

*Place input code here;

 What I did was simply input the names of the variables but in the data step they were empty 

ps: I want to import xls files.

 

Thank you

 @Reeza    Thank you for this useful macro.  However, I feel  there should be a DELETE  statement   within the  IF block in the above  data step  so that a blank row (which has missing values for all columns  except the txt_file_name  column  ) in place of the header row is not written to the output dataset.

 

if _n_ eq 1 or eov then do;
txt_file_name = scan(filename, -1, "\");
eov=0;
delete;
end;

Thanks! simple and useful.

Hello Reeza,

I'm trying to use the code above and have adapted as below but cannot figure out why is not working. All files read have zero records and I thought this was related to name of the variables, may be due to variable 'filename' is in both data steps?

I greatly appreciate your help to fix this.

 

Filename filelist pipe "dir /b /s C:\Documents\*.dat";
Data filelist;
Infile filelist truncover;
Input filename $150.;
Put filename=;
Run;

Proc Print data=filelist (obs=2);

run;

filename
C:\Documents\2020\01\20\20200120-064931-DataLog_User.dat
C:\Documents\2020\01\22\20200122-031029-DataLog_User.dat


data import_all;
SET filelist;
length filename txt_file_name $256;
retain txt_file_name;

infile my_file eov=eov filevar=filename filename=filename truncover;

input@;

file my_file 'myfile.txt';

if _n_ eq 1 or eov then do;
txt_file_name = scan(filename, -1, "\");
eov=0;

delete;
end;

else input X Y Z;
run;

Version history
Last update:
‎02-11-2019 02:24 PM
Updated by:
Contributors

sas-innovate-2024.png

📢

ANNOUNCEMENT

The early bird rate has been extended! Register by March 18 for just $695 - $100 off the standard rate.

 

Check out the agenda and get ready for a jam-packed event featuring workshops, super demos, breakout sessions, roundtables, inspiring keynotes and incredible networking events. 

 

Register now!

Free course: Data Literacy Essentials

Data Literacy is for all, even absolute beginners. Jump on board with this free e-learning  and boost your career prospects.

Get Started

Article Labels
Article Tags