Hi all,
I have a program which reads in 100+ CSV raw data files.
As a check, I'd like to set up the following:
An output that includes:
1. The total # of raw CSV files in the directory
2. The number of CSV files in the directory which were read in by the program
3. Whether the resulting data file had 0 records
I know I can get at #1 using a pipe statement; less sure how to get 2 and 3 and combine them into a single output.
Is this possible?
Any help is much appreciated.
You can get the name of the file currently being read with the FILENAME= option of the INFILE statement, e.g.
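data raw;
length fnam csvfile $260; /* declare a length long enough to hold the full path */
infile 'c:\dir\*.csv' filename=fnam <and other options, like delimiter>;
input <your input statement>;
csvfile=fnam; /* copy after INPUT: the FILENAME= variable is not saved to output */
run;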
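You can then compare this against the directory listing you got with the pipe (or use DOPEN and DREAD, which is more portable) and see how many records came from each file. A file that is in the listing but never shows up in csvfile contributed no records.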
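As a rough sketch of that check, assuming the directory is c:\dir and that RAW has a csvfile variable holding the full path as in the step above: the data set names FILES and FILECHECK and the fileref csvdir are made up for illustration, so adjust them to your setup.

```sas
/* Step 1: list the CSV files in the directory with DOPEN/DREAD.       */
/* The path c:\dir and the fileref csvdir are assumptions.             */
data files(keep=csvfile);
  length csvfile member $260;
  rc  = filename('csvdir', 'c:\dir');     /* point a fileref at the directory */
  did = dopen('csvdir');                  /* returns 0 if the open fails */
  if did > 0 then do;
    do i = 1 to dnum(did);                /* loop over the directory members */
      member = dread(did, i);             /* member name without the path */
      if lowcase(scan(member, -1, '.')) = 'csv' then do;
        csvfile = catx('\', 'c:\dir', member);  /* rebuild the full path */
        output;
      end;
    end;
    rc = dclose(did);
  end;
run;

/* Step 2: count the records in RAW per listed file.                   */
/* Files with no rows in RAW get records = 0.                          */
proc sql;
  create table filecheck as
  select f.csvfile,
         count(r.csvfile) as records
  from files as f
       left join raw as r
       on upcase(f.csvfile) = upcase(r.csvfile)
  group by f.csvfile;
quit;
```

The join compares upper-cased names because the path returned by FILENAME= may not match the case of the directory listing. If the two paths differ in some other way on your system, compare only the member names instead.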
Another possibility is to read the files with the FILEVAR= and EOF= options of the INFILE statement. Assuming the directory scan data is in WORK.FILES, you can do something like this:
data raw(drop=noofrecords) linesread(keep=noofrecords csvfile);
set files;
length csvfile $260;
csvfile=<filename variable>; /* copy: the FILEVAR= variable is not written to output */
infile dummy filevar=<filename variable> eof=done <and delimiters etc>;
do noofrecords=0 by 1; /* counts the records read so far from this file */
input <your input statement>;
output raw;
end;
done: /* EOF= makes the program jump here when there are no more records */
output linesread;
run;
The EOF= option is used here so that empty files also produce an output row (with noofrecords=0).
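To get the three checks from the question into a single output, one option is to summarize LINESREAD. This is only a sketch: the column names files_listed, files_with_records, empty_files and total_records are invented for illustration, and it assumes LINESREAD has one row per file in WORK.FILES as produced by the step above.

```sas
/* One-row summary of the import, built from LINESREAD.              */
proc sql;
  select count(*)             as files_listed,        /* files found in the directory scan */
         sum(noofrecords > 0) as files_with_records,  /* files that contributed rows */
         sum(noofrecords = 0) as empty_files,         /* files that were empty */
         sum(noofrecords)     as total_records        /* 0 means RAW has no records */
  from linesread;
quit;
```

In PROC SQL a comparison such as noofrecords > 0 evaluates to 1 or 0, so summing it counts the matching rows.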