SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

prevent reading blank rows in csvs

Accepted Solution Solved
Reply
Contributor
Posts: 26
Accepted Solution

prevent reading blank rows in csvs

I am using the following code to read multiple csvs.

http://www.ats.ucla.edu/stat/sas/faq/multi_file_read.htm

But, my csvs have blank rows in them.  Any suggestions on how to modify this code to prevent reading the blank rows. The blank rows are making it take a long time to run. 

 I am currently doing it using a second data step with this line of code

if compress(cats(of _all_),'.')=' ' then delete; /*This line was also taken from SAS communites.  Thanks!*/

 


Accepted Solutions
Solution
‎02-16-2017 01:24 PM
Respected Advisor
Posts: 3,892

Re: prevent reading blank rows in csvs

[ Edited ]

When using the INPUT statement the automatic variable _INFILE_ gets created. This variable points to the input buffer.

For an blank line in your source .csv the input buffer will be "blank" and the following code should work.

data dirlist;
  length fname $256;
  infile dirlist length=reclen;
  input fname $varying256. reclen;
  if missing(_INFILE_) then delete;
run;

 

View solution in original post


All Replies
Super Contributor
Posts: 251

Re: prevent reading blank rows in csvs

It could make a difference depending on the file system and structure of the CSV files themselves, but you could try something like this:

if reclen < 2 then
   delete;

If you've used the length=reclen keyword of infile, and your files have variable length records, the blank lines will have a length of 0 or 1 (probably the former).

Contributor
Posts: 26

Re: prevent reading blank rows in csvs

LaurieF,

 

Where would I put this piece of code in the code in the link?

Super Contributor
Posts: 251

Re: prevent reading blank rows in csvs

[ Edited ]

It's hard to tell, because I can't see your exact code!

 

If it's exactly as the link has it, I'd modify it to this:

 

 

do while(not done);
    myfilename = filepath;
    input @;
    if reclen >= 2 then do;
       input name $  x1 x2 x3 @;
       output;
       end;
    input;
  end;

 

The input @; will read the record but not process it and not release it; nevertheless it populates reclen. If the length is 2 or more, read the fields you want and write out an observation. Again the trailing  @ will not release the record. At the end of the do block, the record is again processed as a null process, but releases it so that the next input will carry on correctly.

 

I should think that this will give you quite an improvement, especially if you've got a lot of variables to read.

Solution
‎02-16-2017 01:24 PM
Respected Advisor
Posts: 3,892

Re: prevent reading blank rows in csvs

[ Edited ]

When using the INPUT statement the automatic variable _INFILE_ gets created. This variable points to the input buffer.

For an blank line in your source .csv the input buffer will be "blank" and the following code should work.

data dirlist;
  length fname $256;
  infile dirlist length=reclen;
  input fname $varying256. reclen;
  if missing(_INFILE_) then delete;
run;

 

Valued Guide
Posts: 797

Re: prevent reading blank rows in csvs

I think you mean _INFILE_, not _input_

Respected Advisor
Posts: 3,892

Re: prevent reading blank rows in csvs

I do and fixed in my post. Thanks!

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 311 views
  • 3 likes
  • 4 in conversation