Desktop productivity for business analysts and programmers

blank line as delimiter in text file

N/A
Posts: 1

I'm trying to get SAS to read an ASCII text file with records of variable length separated by blank lines (1 blank line between each record). I simply want to create a 1-column dataset of all records. The problem is that each record consists of many lines broken by carriage returns. I have tried variations of the following to no avail:

data raw;
infile 'c:\mydata.txt' truncover lrecl=99999 dlm='0D0D'X;
input raw $char9999.;
run;

Any ideas appreciated!
SAS Super FREQ
Posts: 8,820

Re: blank line as delimiter in text file

Hi!
I typed this data into Notepad, into a file called c:\temp\cr_blank.txt
[pre]
record 1 and even more stuff
still record 1
and this is also record 1

record2 record2 record2
more record2
guess what this is also record 2
and this too

just one line for record3

[/pre]
I made sure to put a carriage return at the end of every line, and I pressed the space bar once on each "blank" line separating the records.

Then, I used this program to read my file. You may have to change the logic a bit to read your file:
[pre]

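data readraw (keep=record_num cntlines wholerec);
  length wholerec $1000 inline $256;
  retain wholerec record_num cntlines;
  /* LENGTH= stores the length of the current input line in LG */
  infile 'c:\temp\cr_blank.txt' length=lg end=eof ;
  input @1 inline $varying. lg;
  put lg= _infile_ ;   /* echo each line and its length to the log */
  if _n_ = 1 then do;
    cntlines = 0;
    record_num=1;
  end;
  /* a non-blank line: append it to the record being built */
  if lg gt 1 then do;
    cntlines + 1;
    wholerec = trim(wholerec)||' '||trim(inline);
  end;
  /* a blank line (one space here) or end of file closes the record */
  if inline = ' ' or lg eq 1 or eof then do;
    output;
    wholerec = ' ';
    record_num + 1;
    cntlines = 0;
  end;
run;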

ods listing close;
ods html file='c:\temp\readrec.html';
proc print data=readraw;
run;
ods html close;
[/pre]

The key is knowing that $varying. is an INFORMAT that you can use to read variable length records. When you use the LENGTH= option on your INFILE statement, SAS makes a variable (in this case, LG) that contains the "length" of the current record in the input buffer. So I don't have to worry about variable length records, because $varying allows SAS to read each "line". When I did the "put lg= _infile_;" statement, this is what I got in the log:
[pre]
lg=29 record 1 and even more stuff
lg=15 still record 1
lg=26 and this is also record 1
lg=1
lg=24 record2 record2 record2
lg=13 more record2
lg=33 guess what this is also record 2
lg=12 and this too
lg=1
lg=25 just one line for record3
lg=1
[/pre]

Notice how the "length" variable, LG, was 1 on every "blank" line that I had in my record. Even though I hit a Carriage Return (ENTER) key, I believe that it got translated to a space when SAS was reading the file or else it was treated as an "end of record" marker on my operating system (Windows).

You may have to fiddle with your LRECL=, with the LENGTH statement for the INLINE or WHOLEREC variables, or with the logic that decides when a record is output. I took a brute force approach: I concatenated every new "line" (the INLINE variable) onto the WHOLEREC variable, then output WHOLEREC only at the end of each record, along with a RECORD_NUM variable and a CNTLINES variable. CNTLINES is the number of lines that went into making the record.

There are probably some more elegant approaches, involving the use of the @@ to read the input or an explicit DO LOOP; however, I find that sometimes elegant code is harder to maintain and harder to explain.
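
For comparison, here is a rough sketch of what an explicit DO loop version might look like. It is only an illustration, not tested against the original poster's file. The dataset name READRAW2 is made up, CATX assumes SAS 9 or later, and the sketch treats any all-blank line (zero, one, or many spaces) as the separator instead of testing LG.
[pre]
data readraw2 (keep=record_num cntlines wholerec);
  length wholerec $1000 inline $256;
  infile 'c:\temp\cr_blank.txt' length=lg end=eof truncover;
  cntlines = 0;
  /* each pass of the DATA step builds one record */
  do until (eof);
    input @1 inline $varying256. lg;
    if inline = ' ' then leave;   /* blank line closes the record */
    cntlines + 1;
    /* CATX strips leading/trailing blanks and joins with one space */
    wholerec = catx(' ', wholerec, inline);
  end;
  /* skip empty records caused by consecutive blank lines */
  if cntlines > 0 then do;
    record_num + 1;
    output;
  end;
run;
[/pre]
Because WHOLEREC is not retained here, SAS resets it at the top of each DATA step pass, so no explicit clearing is needed. Note that CATX also removes leading blanks from each line, which the TRIM version does not.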

If you try some variation of this technique and are still having problems, then your best bet is to contact Tech Support, because you could send them a copy of your input file on which they could test code. And they could collect any operating system information from you which may affect the kind of code that you need to write. To contact Tech Support, refer to: http://support.sas.com/techsup/contact/index.html

Good luck,
cynthia
Trusted Advisor
Posts: 2,114

Re: blank line as delimiter in text file

Cynthia,

You got lg=1 on your blank lines because you hit a single space before hitting the enter key.

"Leaving", be careful with '0D0D'x as a delimiter. Different operating systems mark the end of a line in different ways. That string would not appear in a Windows file, for instance, because Windows uses two characters to mark the end of a line (CR LF, '0D0A'x).
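
One way to see which line terminators a file really contains is to dump its first bytes in hex. The following is a minimal sketch, assuming the file path from the original post and an arbitrary limit of 120 bytes. RECFM=N reads the file as a byte stream, so CR and LF arrive as ordinary data.
[pre]
data _null_;
  /* stream mode: line terminators are not stripped */
  infile 'c:\mydata.txt' recfm=n;
  input byte $char1.;
  /* write each byte as two hex digits, keeping the log line open */
  put byte $hex2. +1 @@;
  if _n_ >= 120 then stop;   /* 120 is an arbitrary cutoff */
run;
[/pre]
In the log, a Windows line end shows up as 0D 0A and a Unix line end as 0A alone, so an empty line in a Windows file would appear as 0D 0A 0D 0A.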

Doc Muhlbaier
Duke
SAS Super FREQ
Posts: 8,820

Re: blank line as delimiter in text file

Yes, I know that lg=1 because I hit the space bar one time. What I meant to write and didn't (since the mind-typing interface was broken) was that this person needs to do something similar to check their "blank" line to see whether it is one space or 10 spaces or whatever it is. Because he/she will have to change their conditional logic accordingly.
cynthia
ps....still hoping for the mind-SAS interface as well as the mind-typing interface.
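
A quick check along those lines might look like the sketch below. It assumes the original poster's file path, and the names BLANK_LINES, LINE_NUM, and BLANK_LEN are made up for illustration. It lists the length of every line that is empty or contains only spaces, so the conditional logic can be adjusted to match.
[pre]
data blank_lines (keep=line_num blank_len);
  infile 'c:\mydata.txt' length=lg;
  input;                          /* load the line into the buffer */
  /* MISSING is true for an empty or all-space line (not for tabs) */
  if missing(_infile_) then do;
    line_num  = _n_;
    blank_len = lg;               /* LG itself is not written out */
    output;
  end;
run;

proc freq data=blank_lines;
  tables blank_len;               /* how long are the "blank" lines? */
run;
[/pre]
If BLANK_LEN is always 0, the separators are truly empty lines; any other value means they contain spaces.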