Hi all,
I have a program which reads in 100+ CSV raw data files.
As a check, I'd like to set up the following:
An output that includes:
1. The total # of raw CSV files in the directory
2. The number of CSV files in the directory which were read in by the program
3. Whether the resulting data file had 0 records
I know I can get at #1 using a pipe statement; less sure how to get 2 and 3 and combine them into a single output.
Is this possible?
Any help is much appreciated.
You can get the filename of the infile currently being read using the FILENAME option, e.g.
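data raw;
length fnam csvfile $260; /* declare the length so the path is not truncated */
infile 'c:\dir\*.csv' filename=fnam <and other options, like delimiter>;
input <your input statement>;
csvfile=fnam; /* copy after INPUT; the FILENAME= variable itself is not written to the output data set */
run;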
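You can then compare the result with the directory listing you got from the pipe (or build the listing with DOPEN and DREAD, which is more portable) and see how many records came from each file. Files in the listing that never show up in csvfile contributed no records.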
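For the DOPEN/DREAD route, a minimal sketch might look like the following. The fileref DIRREF, the macro variable CSVDIR, and the variable name CSVFILE are my own choices for illustration, and c:\dir is just the example path used above, so adjust them to your setup.

```sas
/* Assumed example directory; replace with your own path */
%let csvdir = c:\dir;

data files(keep=csvfile);
  length csvfile $260 fname $256;
  rc  = filename('dirref', "&csvdir");  /* point fileref DIRREF at the directory */
  did = dopen('dirref');                /* returns 0 if the directory cannot be opened */
  if did > 0 then do;
    do i = 1 to dnum(did);              /* DNUM gives the number of members */
      fname = dread(did, i);            /* name of the i-th member */
      /* keep only members whose extension is csv */
      if lowcase(scan(fname, -1, '.')) = 'csv' then do;
        csvfile = catx('\', "&csvdir", fname);  /* full path, Windows separator assumed */
        output;
      end;
    end;
    rc = dclose(did);
  end;
run;
```

This leaves one observation per CSV file in WORK.FILES, so the number of observations is your count #1.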
Another possibility is to use the FILEVAR and EOF options to read the files - assuming the directory scan data is WORK.FILES, you can do something like this:
data raw(drop=noofrecords) linesread(keep=noofrecords <filename variable>);
set files;
infile dummy filevar=<filename variable> eof=done <and delimiters etc>;
do noofrecords=0 by 1;
input <your input statement>;
output raw;
end;
done: /* EOF makes the program jump here when there are no more records */
output linesread;
run;
The EOF option is used here so that LINESREAD also gets an observation for empty files (noofrecords=0); without it, the step would stop at the first end-of-file and nothing would be written for that file.
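To combine the three checks into a single output, something along these lines could work. It assumes WORK.FILES and WORK.LINESREAD exist as described above; the table name CHECK and the column names are hypothetical.

```sas
proc sql;
  create table check as
  select d.csv_in_dir,                           /* 1. CSV files found in the directory */
         r.csv_read,                             /* 2. files the data step processed    */
         r.csv_empty,                            /*    of those, files with no records  */
         (r.total_records = 0) as raw_is_empty   /* 3. 1 if RAW ended up with 0 records */
  from (select count(*) as csv_in_dir from files) as d,
       (select count(*)             as csv_read,
               sum(noofrecords = 0) as csv_empty,      /* a comparison evaluates to 0 or 1 */
               sum(noofrecords)     as total_records
        from linesread) as r;
quit;
```

Both inline views return exactly one row, so the join gives a one-row table (SAS may write a note about a Cartesian product, which is harmless here). Since LINESREAD is driven by FILES, csv_read should equal csv_in_dir unless the data step stopped early, so a mismatch between the two is itself a useful warning.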