CSV file contains embedded carriage return (CR) and linefeed characters (LF) inside double quotes.
Please guide me how to handle these fields in DI Studio 3.4
Thanks in advance.
Hi Syn,
Just check if your CSV file is corrupted,if not just get original CSV file and try...
Are you converting your files to CSV format from different format?
Thanks,
Shiva
CSV file contains embedded carriage return (CR) and linefeed characters (LF) inside double quotes.
Please guide me how to handle these fields in DI Studio 3.4
Thanks in advance.
I have no idea about DI Studio but here are some things to test using SAS.
Check that whether it has embedded CRLF or just one or the other. On the INFILE statement you can tell SAS to treat CRLF as the end of line marker (standard used by Windows) using the TERMSTR=CRLF option. Then embedded CR or LF characters that do not bump up next to each other to form a CRLF pair will be read.
If you still have trouble then convert the file using a SAS data step or other programming language.
Read the file as bytes and keep your own counter so the program can tell when it is inside a quoted string and then replace existing CRLF with something else.
Tom,
You've replied about termstr=CRLF many times. I know, becuase I just searched for "embedded linefeed" (a problem I needed help with) and found them. It solved my problem!
You could implement a user written transformation which has as input your external file, removes all CR and LF which are inside of quotes and then writes the result to a cleaned new .csv. You then use the file reader on this cleaned version.
You could use a global macro variable as part of the filename of the external file and you then change the value of this macro variable inside the user written transformation (e.g. from myfile.csv to myfile_clean.csv). This way you can use the same external file metadata for both .csv versions.
As for Tom's suggestion: I believe you have to go with the second approach ("read the file as bytes") as TERMSTR=CRLF has only been introduced with SAS9.2
DIS 3.4 means you are still using SAS 9.1.3
Make sure you test the effect of termstr thoroughly as according to this note it's not properly working with 9.1.3 37159 - TERMSTR does not work correctly prior to SAS 9.2
...but may be there has been a patch and the note is outdated.
It's a bit of a hack but what worked for me in the past for removing unwanted characters from the input buffer before mapping variables against it, is to "inject" code as below:
In the external file definition add the following (the ';' at the beginning terminates the INFILE statement):
Using the File Reader you then get code as below:
data <table>;
infile '<external file>'
lrecl = 1000
delimiter = ','
dsd
missover
firstobs = 6
;
input @;
_infile_=translate(_infile_,' ','0A'x,' ','0D'x);
;
;
<attrib statements>
input <list of input variables mapped agains cleansed input buffer>;
run;
Or you can use recfm= option to let SAS take this csv file as a stream file( only has one row) .
infile '.......' recfm=n ;
Xia Keshan
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.
