Jumboshrimps
Obsidian | Level 7

Broke the Internet looking for the answer to this one... or at least my patience.

Using infile to import a couple thousand text files.

 

Data Want;
infile "data/tmz/abcd/ef/ghij/klmnop/qrst/pvwxyz/SURR*.txt" DSD DELIMITER='|' eov=eov ;

input var1 $ var2 $ var3 $ var4 $;
run;

Works fine.

 

However, some text files are empty, just blank.  Zero bytes.  There is a unique name (naturally)

for each text file, embedded in that name is data I can use, even if the text file contains no data.

 

I am just a humble SAS programmer - if that.  I can't do anything about these files with

no data, but I do have to report that XXX numbers of files were submitted and my program

processed XXX number of files - and those numbers better match, whether there is data in 

those files or not.  There are many, many, many posts on Stack Exchange, Stack Overflow, and various university sites

on how to SKIP over empty csv/txt files, but none on how to PROCESS them (SAS, by default, does not process them).

 

 

Below is a portion of the log processing 3,000 or so files where an empty text file was found: 

 

NOTE: The infile "data/tmz/abcd/ef/ghij/klmnop/qrst/pvwxyz/SURR*.txt" is:
Filename=/data/tmz/abcd/ef/ghij/klmnop/qrst/pvwxyz/SURR_XXX_xxx.txt, <-- need the file name only
File List=/data/tmz/abcd/ef/ghij/klmnop/qrst/pvwxyz/SURR_XXX_xxx.txt,
Owner Name=WhoseUrDaddy,Group Name=AA,
Access Permission=-rw-rw-r--,
Last Modified=28Aug2019:09:38:56,
File Size (bytes)=0

 

 

This is the Linux version of SAS 9.4, but the same result in Windows 10 9.4.

 

All I want is to check if the file has data or not, if not, append the filename to the dataset. 

If I add the filename to the existing infile -
length filename $77.; for example, 

SAS will add the filename of those txt files with records

but, again, skip over the empty files.

 

I am looking for code to:

A) Check if the txt file has data (something like _n_ = 0); if the file has data, run the infile/input for var1, var2, var3, etc.

 

B) If the text file has a file size of zero bytes - extract the filename only - something like below:

 

rc=filename("FILE","data/tmz/abcd/ef/ghij/klmnop/qrst/pvwxyz/SURR*.txt");
fid=fopen("FILE");
infonum=foptnum(fid);
do i=1 to infonum;
infoname=foptname(fid,i);
infoval=finfo(fid,infoname);
output;
end;
close=fclose(fid);

 

I just don't know how to incorporate both of these snippets of code into one SAS program.

We're talking about 3,200 files in this process.  

 

 

 

 

 

 

 

ballardw
Super User

There are dozens of examples on the forum of getting file listings into a data set, such as FILENAME PIPE with a directory listing command.

That might get you the information you need, though not a data set with 0 observations. Compare that listing against the filenames added to your data set.

 

Or, with the same directory listing approach, capture the file size and select the files with 0 bytes (or the appropriate reported size).
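
As a rough sketch of that second idea on Linux: assuming the SAS session is allowed to run operating system commands (XCMD) and that GNU find is installed, something like the following could list each file together with its size. The fileref flist, the data set file_sizes, and the variables bytes, path and is_empty are made-up names for illustration; the directory is the one shown in the log in the question.

```sas
/* Assumption: XCMD is enabled and GNU find (with -printf) is available. */
/* Single quotes around the command keep the macro processor away from   */
/* the %s and %p format codes that find uses for size and path.          */
filename flist pipe
   'find /data/tmz/abcd/ef/ghij/klmnop/qrst/pvwxyz -maxdepth 1 -name "SURR*.txt" -printf "%s|%p\n"';

data file_sizes;
   infile flist dlm='|' truncover;
   length bytes 8 path $256;
   input bytes path :$256.;        /* size in bytes, then the full path   */
   is_empty = (bytes = 0);         /* 1 for zero-byte files, 0 otherwise  */
run;

filename flist clear;
```

The rows with is_empty = 1 are the files a plain wildcard INFILE would pass over without reading any records.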

data_null__
Jade | Level 19

If you change the way you read the files to use INFILE option FILEVAR you can create a list of files as you read them, with an indicator of whether or not they have records.

 

filename FT15F001 'z1.txt';
parmcards;
a
b
;;;;
filename FT15F001 'z2.txt';
parmcards;
;;;;
filename FT15F001 'z3.txt';
parmcards;
c
d
;;;;


data driver;
   cmd = 'dir /b z*.txt';
   infile dummy pipe filevar=cmd end=eof;
   do while(not eof);
      input filename &$128.;
      put _infile_;
      output;
      end;
   stop;
   run;
proc print;
   run;

data z(keep=x) files(keep=filename zerorecs);
   set driver;
   filevar=filename;
   length fname $128;
   infile dummy filevar=filevar end=eof filename=fname;
   putlog fname= eof=;
   filename=fname;
   zerorecs=eof;
   output files;
   do while(not eof);
      input x :$1.;
      output z;
      end;
   run;
proc print data=files;
   run;
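
If the goal is to report that the number of files submitted matches the number processed, the FILES data set created above can be summarized directly. This is only a sketch; the column aliases files_seen, empty_files and files_with_records are illustrative names, not part of the code above.

```sas
/* FILES has one row per file listed in DRIVER;                 */
/* ZERORECS is 1 when the file had no records and 0 otherwise.  */
proc sql;
   select count(*)          as files_seen,
          sum(zerorecs)     as empty_files,
          sum(zerorecs = 0) as files_with_records
   from files;
quit;
```

files_seen should equal the number of files delivered, whether or not they contained data.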


Jumboshrimps
Obsidian | Level 7

I had tried the filename filelist pipe 'dir /b /s home/my/mother/the/car/SURR*.TXT' previously

but it doesn't work on SAS 9.4 running on Linux.

Result is below.

 

What about:  

 

DATA ASCIIFILES;
LENGTH FILENAME $55.;
rc=filename("FILENAME","\home\my\mother\the\car\SURR*.TXT");
did=dopen("FILENAME");
if did > 0 then do; /* if did = 0, insert filename only, don't skip over */
num = dnum(did);
do i = 1 to num;
FILENAME = dread(did,i); 
EXT= substr(FILENAME,length(FILENAME)-2,3); /* identify txt files only -
other departments share this directory and put their junk in there */
OUTPUT;
end;
RC=dclose(did);
end;
run;

 

 

[Screenshot: results of Filename filelist pipe 'dir /b /s /xxxx/xxxx/xxxx/SURR*.txt']

data_null__
Jade | Level 19

DIR is NOT a UNIX command. Use the proper UNIX command, LS.

Tom
Super User

How did you get that output using a command that doesn't exist?  Perhaps someone made an alias for you to convert dir into ls?

Try it this way instead. Are you sure, though, that all of the files use capital SURR at the beginning and capital TXT as the extension? Unix filenames are case sensitive.

infile 'ls -d /home/my/mother/the/car/SURR*.TXT' pipe truncover;
input filename $256.;

And this cannot work 

rc=filename("FILENAME","\home\my\mother\the\car\SURR*.TXT");
did=dopen("FILENAME");

Since on UNIX a \ just protects the next character.  So you asked it to open a file named:

homemymotherthecarSURR*.TXT

as if it was a directory.

Did you try this instead?

rc=filename("FILENAME","/home/my/mother/the/car/");
did=dopen("FILENAME");
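
A fuller sketch of the directory approach, for a site where piping to operating system commands is not allowed: open the directory with DOPEN, loop over its members with DREAD, and ask FINFO for each file's size. The filerefs surrdir and surrfile, the data set dir_list, and the variables fname, fullpath and bytes are made-up names; the 'File Size (bytes)' item name is taken from the log in the question and may differ on other hosts.

```sas
data dir_list;
   length fname $256 fullpath $512;
   dir = '/home/my/mother/the/car/';            /* directory only, no wildcard */
   rc  = filename('surrdir', dir);
   did = dopen('surrdir');
   if did > 0 then do;
      do i = 1 to dnum(did);
         fname = dread(did, i);
         /* keep only SURR*.txt members, ignoring case */
         if upcase(fname) =: 'SURR' and upcase(scan(fname, -1, '.')) = 'TXT' then do;
            bytes    = .;                       /* reset for each file          */
            fullpath = cats(dir, fname);
            rc  = filename('surrfile', strip(fullpath));
            fid = fopen('surrfile');
            if fid > 0 then do;
               bytes = input(finfo(fid, 'File Size (bytes)'), best32.);
               rc = fclose(fid);
            end;
            rc = filename('surrfile');          /* deassign the file fileref    */
            output;
         end;
      end;
      rc = dclose(did);
   end;
   rc = filename('surrdir');                    /* deassign the directory ref   */
   keep fname bytes;
run;
```

Files with bytes = 0 are the empty ones; a missing value for bytes would mean the file could not be opened.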

 

Jumboshrimps
Obsidian | Level 7

Now that worked on the Linux SAS, replacing "dir" with "ls -d".

 

Thanx.
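
For anyone landing here later, this is one way the pieces from this thread might fit together on Linux. It is a sketch under assumptions, not tested against the real files: it assumes the ls listing from Tom's reply, the FILEVAR technique from data_null__'s reply, and the pipe-delimited, four-variable layout from the original INFILE. The names want, files, zerorecs and filevar are illustrative.

```sas
/* Step 1: one row per file, including the zero-byte ones. */
data driver;
   infile 'ls -d /data/tmz/abcd/ef/ghij/klmnop/qrst/pvwxyz/SURR*.txt' pipe truncover;
   input filename $256.;
run;

/* Step 2: read each file in turn. An empty file still contributes  */
/* one row to WANT (filename only) and is flagged in FILES.         */
data want(keep=filename var1-var4) files(keep=filename zerorecs);
   set driver;
   length var1-var4 $8;
   filevar = filename;
   infile dummy filevar=filevar dsd dlm='|' truncover end=eof;
   zerorecs = eof;                 /* eof is already 1 if the file is empty */
   output files;
   if zerorecs then output want;   /* keep the name of an empty file        */
   do while (not eof);
      input var1 $ var2 $ var3 $ var4 $;
      output want;
   end;
run;
```

The row count of FILES is then the number of files processed, and the rows with zerorecs = 1 are the empty ones.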

 
