Example 5 (page 12) of Paper 256-2011, "Finding Your Way Through the Wilderness",
shows how recfm=n can be used to read lines longer than 32,767 characters, byte by byte.
What I don't get is why the record was not truncated to 32,767 characters, which would be the default lrecl since the value was not explicitly assigned. I believe my confusion comes from a misunderstanding of how SAS 'buffering' works while reading files: as I understand it, infile puts data from the file into a "buffer" (which cannot be longer than 32,767 characters) and input puts data from the buffer into SAS variables.
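data _null_;
  /* recfm=n: treat both files as byte streams, with no record boundaries */
  infile 'c:\_today\claude.csv' recfm=n;
  file 'c:\_today\claude_.csv' recfm=n;
  input a $char1.;              /* read one byte */
  put a $char1.;                /* copy it to the output */
  if a = ',' then c+1;          /* c counts commas seen so far */
  if c=3 then do;               /* after the third comma */
    d+1;                        /* d counts bytes, including that comma */
    if d=32767 then put ',';    /* insert a comma at byte 32,767 */
  end;
  if c=4 then do;               /* after the fourth comma */
    if d<32767 then put ',';    /* short field: add one extra comma */
    d=32768;                    /* so the extra comma is written only once */
  end;
  if c=5 then do;               /* fifth comma: reset both counters */
    d=0;
    c=0;
  end;
run;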
I don't really know the details of how SAS reads/buffers data, but I suspect it is a little more complex than your description. It is probably reading/buffering the data from the disk and then staging/parsing that for the data step.
SAS does enforce a maximum record length when it parses the data into records for you, but it is not 32,767. That is just the default value of LRECL=.
But when you use RECFM=N then it is NOT parsing the data into records for you. It is just giving you the data.
If you use RECFM=F then it splits the data into fixed-length records. If you use RECFM=V then it looks for the end-of-record character(s) (see the TERMSTR= option) to parse the data into records.
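To illustrate the point about the default, here is a minimal sketch of the record-based alternative. It assumes the same input file as in the question; the LRECL= value of 1000000 is an arbitrary number picked for illustration, and reclen is a hypothetical variable name. It only reports the length of each record and does not reproduce the comma handling from the paper's example.

```sas
data _null_;
  /* Default variable-length records: SAS splits the file at line terminators. */
  /* LRECL=1000000 is an assumed value that raises the limit above 32,767.     */
  /* LENGTH=reclen names a variable that receives the current record's length. */
  infile 'c:\_today\claude.csv' lrecl=1000000 length=reclen;
  input;               /* load the next record into the input buffer */
  put _n_= reclen=;    /* write the record number and its length to the log */
run;
```

If a record is still longer than the LRECL= value you set, it is truncated at that length, so the value has to be at least as large as the longest line in the file.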
recfm=n means that the input is read as a stream (there are no records), and for each input the required number of bytes is read; the input buffer has an effective length of 1.
I was partly wrong: with recfm=n SAS does not have an input buffer at all. _infile_ does not work with recfm=n, and SAS simply reads as many bytes as requested from the input stream.