Reading from a text file

Contributor
Posts: 37

Reading from a text file


Hi,

I have something like this in a huge text file and need to read it into four variables: Seq_Num, ID_1, ID_2, ID_3.

The data elements for the four variables are separated by commas (,) while observations are separated by semicolons (;).  The lines in the text file seem to break at the line-length limit (1024) of the text application, so they do not necessarily break between observations: some break between data elements (as in the 9th, 34th, and 51st observations) and some even break in the middle of a data element (as in the 26th and 43rd observations).

I did more digging.  It seems that the data wraps to the next line only because of the text application's limit, and when the "infile" statement reads the data, the lines are treated as one continuous record.  When I specified lrecl=32760, the infile statement seemed to work fine up to the end of that length, but then it skipped a large chunk of data and restarted.  So the issue seems to be how to let the infile statement keep reading the data as if there were no break.  Is there a way to specify unlimited length?

Can someone help me with this?

Thanks a lot!

Jason

1,9,1090047839,1173374;2,9,1090097121,1173533;3,9,1090039710,1173605;4,9,1090110554,1173721;5,9,1090061886,1173808;6,9,1090042778,1173886;7,9,1090025342,1173896;8,9,1090061583,1173974;9,9,1090087933,

1174475;10,9,1090011511,1174481;11,9,1090015051,1174487;12,9,1090080342,1174542;13,9,1090078261,1174983;14,9,1090082986,1175071;15,9,1090083901,1175074;16,9,1090053914,1175268;17,9,1090092801,1175902;

18,9,1090009009,1175968;19,9,1090081788,1176024;20,9,1090048627,1176197;21,9,1090030477,1176199;22,9,1090063356,1176238;23,9,1090057049,1176271;24,9,1090021020,1176354;25,9,1090094964,1176363;2

6,9,1090070242,1176379;27,9,1090025309,1176475;28,9,1090088411,1176499;29,9,1090016169,1176929;30,9,1090099295,1177030;31,9,1090076688,1177256;32,9,1090035490,1177359;33,9,1090048985,1177387;34,9,1090036555,

1177531;35,9,1090077136,1177593;36,9,1090091925,1178252;37,9,1090038714,1178461;38,9,1090079303,1178463;39,9,1090037055,1178475;40,9,1090087386,1178726;41,9,1090017360,1179277;42,9,1090033361,1179523;4

3,9,1090029360,1179794;44,9,1090094748,1179987;45,9,1090007983,1180199;46,9,1090082367,1180427;47,9,1090088336,1180611;48,9,1090041137,1180665;49,9,1090017279,1180753;50,9,1090025309,1180790;51,9,1090077618,

1181076;52,9,1090031408,1181096;53,9,1090001569,1181382;54,9,1090063363,1181391;55,9,1090063308,1181481;56,9,1090005325,1181607;57,9,1090084647,1181632; ......

Contributor
Posts: 43

Re: Reading from a text file

My first thought was to see if you can define your "end-of-line" character.  I found:

INFILE TERMSTR option:

Valid TERMSTR= values are:

  CRLF (Carriage Return Line Feed) - use TERMSTR=CRLF to read Windows formatted files.  This is the default on Windows platforms.

  LF   (Line Feed) - use TERMSTR=LF to read UNIX formatted files.  This is the default on UNIX systems.

  CR   (Carriage Return) - use TERMSTR=CR to read MAC formatted files.

So, it doesn't look like you can define your own character.  Is there any chance you can work with the sender of the data to use a "Char(10)" = LineFeed or "Char(13)" = CarriageReturn instead of the semicolon?  (In SAS, one can specify these as hex constants in a variable: LF = '0A'x; or CR = '0D'x; since decimal 10 is hex 0A and decimal 13 is hex 0D.)
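As a minimal sketch of the two points above, the first step below defines the line feed and carriage return characters as hex constants, and the second reads a few raw records with TERMSTR=LF. The fileref rawtxt, the file path, and the data set name peek are assumptions made up for illustration; they are not from the original post.

```sas
/* Hypothetical path: replace with the real file location. */
filename rawtxt 'C:\data\huge_file.txt';

data _null_;
    /* Hex character constants: decimal 10 is hex 0A, decimal 13 is hex 0D. */
    LF = '0A'x;
    CR = '0D'x;
    /* Write both values to the log in hex to confirm them. */
    put LF= $hex2. CR= $hex2.;
run;

data peek;
    /* TERMSTR= accepts only CRLF, LF, or CR, not an arbitrary character. */
    infile rawtxt termstr=LF lrecl=32767 obs=5;
    /* A bare INPUT loads the record into _INFILE_ without parsing it. */
    input;
    line_length = length(_infile_);
run;
```

Checking line_length for the first few records is one way to see where SAS thinks each record ends before deciding how to read the file.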

Contributor
Posts: 37

Re: Reading from a text file

Thanks Wilson.

In fact I used lrecl=32760000 and dlm=',;' and used @@ at the end of the input statement, and the code worked.

Evidently lrecl=32760000 covers the entire length of the text file (25 MB!) and is within the limit of the infile statement.
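A minimal sketch of the step described above, assuming the file is reachable through a fileref. The fileref rawtxt, the path, and the output data set name ids are placeholders invented for illustration; the options and variable names are the ones mentioned in this thread.

```sas
/* Hypothetical path: replace with the real file location. */
filename rawtxt 'C:\data\huge_file.txt';

data ids;
    /* LRECL= is set large enough to hold the whole file as one record;  */
    /* DLM=',;' treats both commas and semicolons as delimiters.         */
    infile rawtxt lrecl=32760000 dlm=',;';
    /* The trailing @@ holds the record across iterations of the DATA    */
    /* step, so each pass reads the next four values from the same line. */
    input Seq_Num ID_1 ID_2 ID_3 @@;
run;
```

This relies on the file containing no real line terminators between observations. If a genuine line break fell in the middle of a value, list input would read the two halves as separate values.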

Thanks everyone.

Jason


Contributor
Posts: 43

Re: Reading from a text file

Ah, it's good to know that 32767 is no longer the maximum value you can specify for lrecl.
