I am using the following code to read multiple csvs.
http://www.ats.ucla.edu/stat/sas/faq/multi_file_read.htm
But, my csvs have blank rows in them. Any suggestions on how to modify this code to prevent reading the blank rows. The blank rows are making it take a long time to run.
I am currently doing it using a second data step with this line of code
if compress(cats(of _all_),'.')=' ' then delete; /*This line was also taken from SAS communites. Thanks!*/
When using the INPUT statement the automatic variable _INFILE_ gets created. This variable points to the input buffer.
For an blank line in your source .csv the input buffer will be "blank" and the following code should work.
data dirlist;
length fname $256;
infile dirlist length=reclen;
input fname $varying256. reclen;
if missing(_INFILE_) then delete;
run;
It could make a difference depending on the file system and structure of the CSV files themselves, but you could try something like this:
if reclen < 2 then
delete;
If you've used the length=reclen keyword of infile, and your files have variable length records, the blank lines will have a length of 0 or 1 (probably the former).
LaurieF,
Where would I put this piece of code in the code in the link?
It's hard to tell, because I can't see your exact code!
If it's exactly as the link has it, I'd modify it to this:
do while(not done);
myfilename = filepath;
input @;
if reclen >= 2 then do;
input name $ x1 x2 x3 @;
output;
end;
input;
end;
The input @; will read the record but not process it and not release it; nevertheless it populates reclen. If the length is 2 or more, read the fields you want and write out an observation. Again the trailing @ will not release the record. At the end of the do block, the record is again processed as a null process, but releases it so that the next input will carry on correctly.
I should think that this will give you quite an improvement, especially if you've got a lot of variables to read.
When using the INPUT statement the automatic variable _INFILE_ gets created. This variable points to the input buffer.
For an blank line in your source .csv the input buffer will be "blank" and the following code should work.
data dirlist;
length fname $256;
infile dirlist length=reclen;
input fname $varying256. reclen;
if missing(_INFILE_) then delete;
run;
I think you mean _INFILE_, not _input_
I do and fixed in my post. Thanks!
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.