Example 5 (page 12) of Paper 256-2011: Finding Your Way Through the Wilderness
shows how recfm=n can be used to read lines longer than 32,767 characters, byte by byte.
What I don't get is why the record was not truncated at 32,767, which would be the default lrecl, since the value was not explicitly assigned. I believe my confusion could come from a misunderstanding of how SAS 'buffering' works while reading files: as I understand it, infile puts data from the file into a "buffer" (which cannot be longer than 32,767 characters) and input puts data from the buffer into SAS variables.
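data _null_;
  /* recfm=n: both files are treated as byte streams, with no record boundaries */
  infile 'c:\_today\claude.csv' recfm=n;
  file 'c:\_today\claude_.csv' recfm=n;
  input a $char1.;          /* read one byte */
  put a $char1.;            /* copy it to the output file */
  if a = ',' then c+1;      /* c counts commas since the last reset */
  if c=3 then do;
    d+1;                    /* d counts bytes read while c=3 */
    if d=32767 then put ','; /* insert a comma once d reaches 32767 */
  end;
  if c=4 then do;
    if d<32767 then put ','; /* no comma was inserted above, so add one here */
    d=32768;                /* prevents adding a second comma while c=4 */
  end;
  if c=5 then do;           /* reset both counters after the fifth comma */
    d=0;
    c=0;
  end;
run;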
I don't really know the details of how SAS reads/buffers data, but I suspect it is a little more complex than your description. It is probably reading/buffering the data from the disk and then staging/parsing that for the data step.
There is a maximum record length for when SAS is parsing the data into records for you, but it is not 32,767. That is just the default, which you can change with the LRECL= option.
But when you use RECFM=N then it is NOT parsing the data into records for you. It is just giving you the data.
If you use RECFM=F then it splits the data into fixed-length records. If you use RECFM=V then it looks for the end-of-record character(s) (see the TERMSTR= option) to parse the data into records.
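For comparison, here is a minimal sketch of the record-oriented alternative: keep RECFM=V and raise the limit with LRECL= instead of switching to a byte stream. The file name is the one from the question; the LRECL= value of 1000000, the TERMSTR=CRLF setting and the variable name reclen are assumptions chosen for illustration, so adjust them to your data.

```sas
data _null_;
  /* Assumptions: records end in CR+LF and none is longer than 1,000,000 bytes */
  infile 'c:\_today\claude.csv' recfm=v lrecl=1000000 termstr=crlf
         length=reclen;   /* reclen receives the length of the current record */
  input;                  /* load the next record without assigning variables */
  put _n_= reclen=;       /* write the record number and its length to the log */
run;
```

If the log shows lengths above 32,767, the records are being read whole rather than cut at the default.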
recfm=n means that the input is read as a stream (there are no records), and for each input the required number of bytes is read; the input buffer has an effective length of 1.
I was not quite right: with recfm=n SAS does not have an input buffer at all; _infile_ does not work with recfm=n, and SAS simply reads as many bytes as requested from the input stream.
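To see the stream behaviour for yourself, something like the following sketch may help. It reads the file from the question one byte at a time and reports where the first line feed occurs, which works even when that position is past 32,767. The counter name n is made up for this example, and the step assumes the line ends with a line feed ('0A'x).

```sas
data _null_;
  infile 'c:\_today\claude.csv' recfm=n;  /* byte stream, no records */
  input a $char1.;                        /* each INPUT consumes exactly one byte */
  n + 1;                                  /* running count of bytes read so far */
  if a = '0A'x then do;                   /* first line feed found */
    put 'First line feed is at byte ' n;
    stop;                                 /* no need to read the rest of the file */
  end;
run;
```

If the file contains no line feed, the step just ends at end of file without writing anything to the log.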