07-25-2013 08:53 AM
I have something like this in a huge text file and need to read it into four variables: Seq_Num, ID_1, ID_2, ID_3.
The data elements for the four variables are separated by commas (,) while observations are separated by semicolons (;). The lines in the text file seem to break at the limit (1024) of the txt application and may not break between observations (such as the 9th, 34th, and 51st observations). They may even break in the middle of a data element (such as the 26th and 43rd observations).
I did more digging. It seems that the data wraps to the next line only because of the line limit of the txt application, and when the "infile" statement reads the data, the lines are treated as continuous. When I specified lrecl=32760, the infile statement seemed to work fine up to the end of that length, but then it skipped a large chunk of data and restarted. So the issue seems to be how to let the infile statement keep reading the data as if there were no break. Is there a way to specify unlimited length?
Can someone help me with this?
Thanks a lot!
07-25-2013 10:44 AM
My first thought was to see if you can define your "end-of-line" character. I found:
INFILE TERMSTR= option:
Valid TERMSTR= values are:
CRLF (Carriage Return Line Feed) - use TERMSTR=CRLF to read Windows formatted files. CRLF is the default on Windows platforms.
LF (Line Feed) - use TERMSTR=LF to read UNIX formatted files. This is the default on UNIX systems.
CR (Carriage Return) - use TERMSTR=CR to read MAC formatted files.
So, it doesn't look like you can define your own character. Is there any chance you can work with the sender of the data to use a "Char(10)" = LineFeed or "Char(13)" = CarriageReturn instead of the semicolon? (In SAS, these can be written as hex constants: LF = '0A'x; or CR = '0D'x;)
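If the sender could re-deliver the file with a line feed after each observation instead of the semicolon, the read might look something like the sketch below. This is only an illustration under that assumption: the file name observations_lf.txt is made up, and treating the three ID fields as character values of up to 20 bytes is a guess, since the thread does not say what they contain.

```sas
/* Sketch only: assumes each observation now ends with a line feed. */
/* The file name and the $20 lengths are hypothetical.              */
data want;
    length ID_1 ID_2 ID_3 $20;
    /* TERMSTR=LF: each record ends at a line feed; DLM=',' splits the fields */
    infile 'observations_lf.txt' termstr=LF dlm=',';
    input Seq_Num ID_1 $ ID_2 $ ID_3 $;
run;
```

With one observation per record, no trailing @@ is needed on the INPUT statement.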
07-25-2013 10:52 AM
In fact I used lrecl=32760000 and dlm=',;' and used @@ at the end of the input statement, and the code worked.
Evidently lrecl=32760000 covers the entire length of the text file (25 MB!) and is still within the limit of the infile statement.
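For anyone finding this later, the step described above would look roughly like the sketch below. Only lrecl=32760000, dlm=',;' and the trailing @@ come from the post. The file name huge_file.txt and the character type and $20 length of the ID variables are assumptions for illustration, so adjust them to the real data.

```sas
/* Sketch of the approach described above. File name and $20 lengths are hypothetical. */
data want;
    length ID_1 ID_2 ID_3 $20;
    /* LRECL large enough to hold the whole file as one record;      */
    /* both the comma and the semicolon are treated as delimiters.   */
    infile 'huge_file.txt' lrecl=32760000 dlm=',;';
    /* The trailing @@ holds the record across iterations of the DATA step, */
    /* so each iteration reads the next four values as one observation.     */
    input Seq_Num ID_1 $ ID_2 $ ID_3 $ @@;
run;
```

Because the semicolon is just another delimiter here, this relies on every observation having exactly four values. A missing field would shift all the following values by one position.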