Import raw text file

Contributor
Posts: 22

Hi Experts,

I am trying to import a messy raw text file in SAS EG using the INFILE statement. The file has 100,000 rows, but only 2,000 rows are imported. I looked at the text file and saw that at row 2001 there is an ‘arrow’ sign in the middle of the data. Maybe that’s the reason nothing is imported from this row onwards.

Is there any way to import all the data even if it contains some odd characters?


Thanks

Super User
Posts: 7,720

Re: Import raw text file

You could try pre-processing the file to exclude any characters you don't need, something like:

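data _null_;
     infile "C:\test.txt" recfm=n;      /* read the file as a byte stream */
     file "C:\NEW_Test.txt" recfm=n;    /* write the copy as a byte stream */
     input a $char1.;                   /* one character per iteration */
     /* write the character only if it survives the compress() filter */
     if a = " " or compress(a, , "knps") ne " " then put a $char1.;
run;

So the file is read one character at a time. COMPRESS with the k modifier keeps only the listed character classes: letters, digits and underscore (n), punctuation (p), and space characters such as blanks, tabs and line breaks (s). A PUT statement cannot take a function call, so each character is tested first and written out only if the filter keeps it. You can then read in the NEW_Test.txt file without special characters.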

Respected Advisor
Posts: 3,156

Re: Import raw text file

If your SAS is running on Windows, another possibility is that your text file has an embedded, non-printing 'end of file' character, namely '1A'x. In this case, you need to tell SAS to ignore it:

infile test ignoredoseof;
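
For context, a minimal sketch of where the option sits in a complete step. The fileref `test` pointing at C:\test.txt, the output data set name `want`, and the single 200-character variable `line` are placeholders for illustration; the real INPUT statement depends on the layout of the file.

```sas
/* Sketch only: path, data set name and variable are assumptions. */
filename test "C:\test.txt";

data want;
    /* IGNOREDOSEOF: treat '1A'x (Ctrl-Z) as data, not as end of file. */
    /* TRUNCOVER: short lines do not pull data from the next line.     */
    infile test ignoredoseof truncover;
    input line $char200.;
run;
```

After the step runs, the log note giving the number of records read from the infile should be close to the expected row count.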

Contributor
Posts: 22

Re: Import raw text file

Thanks. It worked :)

So this is a non-printing character that we don't see in the file? That is tricky. How do we know whether the data is not being fully imported because of some weird character, an odd sign/symbol, or an end-of-file symbol?

Respected Advisor
Posts: 3,156

Re: Import raw text file

I am afraid there really is no easy programmatic way to tell. I would reach out to the data provider to get some metadata, at least at a ballpark level, such as how many records and fields there are in total. Understanding how the data is generated also helps: for instance, if you know you are getting a whole year of data that is collected on a monthly basis, then there is a chance that an 'end of file' symbol is embedded.
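
That said, for the one specific case of an embedded '1A'x, a rough check is possible. The sketch below assumes the same hypothetical fileref `test` used earlier in the thread and only looks for that single character, so it does not detect other kinds of bad data. It also only sees as much of each line as the record length (LRECL) allows.

```sas
/* Sketch only: reports lines containing '1A'x and the total line count. */
data _null_;
    infile test ignoredoseof end=last;
    input;                                  /* load the line into _INFILE_ */
    if index(_infile_, '1A'x) > 0 then
        put "Ctrl-Z found on line " _n_;
    if last then
        put "Total lines read: " _n_;
run;
```

Comparing the total written to the log against the record count supplied by the data provider gives a simple sanity check that the whole file was read.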
