The raw data file contains a user-provided comment field which, in one case, contains multiple consecutive blanks and spans 3 input lines:
"This fund is an S&P 500 fund and similar to other approved products on our platform in
terms of fees and performance. The fund is rated 5 Stars with Morningstar. RIM and Adam
Taback approval available along with Morningstar Report in FileNet 7.24.25"
(Please note: to the right of the last visible word on the first and second lines, "in" and "Adam", there are many blanks before the end of the input line.)
I have specified the variable as having a length, format and informat of 1500. In the INFILE statement I am using FLOWOVER, LRECL=32767, delimiter=',' and DSD.
In spite of all these options, the variable is populated with only the first line: "This fund is an S&P 500 fund and similar to other approved products on our platform in"
How can I capture the 2nd and 3rd lines so that the variable is completely populated? I thought FLOWOVER would handle this specific use case.
In case there are embedded carriage returns or line feeds in a quoted string, consider the following:
https://support.sas.com/kb/26/065.html
If the file is supposed to have one observation per line, do not use FLOWOVER; use TRUNCOVER instead. Also, 32K is already the default LRECL. If the file has lines longer than that, use a larger value. SAS should happily support logical record lengths of 2 million bytes or more.
But it looks like the problem is not that the value is too long (although there is a 32K-byte limit on the length of a character variable in a SAS dataset), but that the value has been split across multiple lines.
If the file was created using CR+LF to mark the ends of lines, and the breaks you are seeing inside the value are just a bare CR or a bare LF, then you should be able to read the file simply by adding TERMSTR=CRLF to the INFILE statement:
infile 'myfile.txt' dsd truncover lrecl=1000000 termstr=crlf;
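Put into a complete step, that might look something like the sketch below. The dataset name WANT and the variables ID and COMMENT are hypothetical placeholders, since the real column layout was not posted. It also assumes the embedded breaks really are a bare CR or LF, in which case they stay inside the value and can be turned into blanks afterwards.

```sas
data want;
  /* With TERMSTR=CRLF only CR+LF ends a record, so a lone CR or LF */
  /* inside the quoted comment is read as part of the value.        */
  infile 'myfile.txt' dsd truncover lrecl=1000000 termstr=crlf;

  /* Hypothetical layout: adjust names and lengths to the real file. */
  length id $20 comment $1500;
  input id comment;

  /* Replace any embedded CR ('0D'x) or LF ('0A'x) with a blank,     */
  /* then collapse the runs of blanks that padded each original line. */
  comment = compbl(translate(comment, '  ', '0D0A'x));
run;
```

If COMMENT still ends after the first line, the embedded breaks are probably the same CR+LF pair that ends every record, and TERMSTR= alone will not help.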
However, if there is no difference between the characters used to mark the end of a line and those that have been inserted into the middle of the character value, then SAS cannot directly parse that file.
If the values are quoted properly, like your first one, then you should be able to pre-process the file and remove (or replace) any end-of-line characters that are inside quotes. You can tell whether a character is inside quotes just by counting how many quotes have appeared so far: an odd count means you are inside a quoted value.
There is a macro available that can do this for you here: https://github.com/sasutils/macros/blob/master/replace_crlf.sas
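To give a sense of what such pre-processing involves, here is a minimal sketch of the quote-counting idea. It is not the macro's code. The file names are placeholders, and it assumes every quoted value is closed properly (a doubled quote inside a value toggles the flag twice, so it does no harm).

```sas
data _null_;
  /* RECFM=N treats the file as a byte stream with no record boundaries. */
  infile 'myfile.txt' recfm=n;
  file 'myfile_fixed.txt' recfm=n;   /* placeholder output name */

  /* INQUOTE is 1 while an odd number of double quotes has been seen. */
  retain inquote 0;

  input ch $char1.;                  /* read exactly one byte */
  if ch = '"' then inquote = not inquote;

  /* Inside quotes, write a blank instead of CR or LF; */
  /* every other byte is copied through unchanged.     */
  if inquote and ch in ('0D'x, '0A'x) then put ' ';
  else put ch $char1.;
run;
```

The cleaned copy can then be read with an ordinary INFILE statement using DSD and TRUNCOVER, because each observation is back on a single line.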