<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Read a tab delimited text file with fields that contain embedded CR-LF in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Read-a-tab-delimited-text-file-with-fields-that-contain-embedded/m-p/963721#M375406</link>
    <description>&lt;P&gt;In a quick search, I found this thread:&lt;BR /&gt;&lt;A href="https://communities.sas.com/t5/SAS-Programming/Input-a-CSV-file-with-text-fields-that-contain-line-breaks/td-p/821603" target="_blank"&gt;https://communities.sas.com/t5/SAS-Programming/Input-a-CSV-file-with-text-fields-that-contain-line-breaks/td-p/821603&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;There,&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/159"&gt;@Tom&lt;/a&gt;&amp;nbsp;links to his utility macro, which replaces CRLFs that appear inside quotes. It could be a helpful approach for pre-processing the file:&lt;/P&gt;
&lt;P&gt;&lt;A href="https://github.com/sasutils/macros/blob/master/replace_crlf.sas" target="_blank"&gt;https://github.com/sasutils/macros/blob/master/replace_crlf.sas&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Tue, 08 Apr 2025 15:33:24 GMT</pubDate>
    <dc:creator>Quentin</dc:creator>
    <dc:date>2025-04-08T15:33:24Z</dc:date>
    <item>
      <title>Read a tab delimited text file with fields that contain embedded CR-LF</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Read-a-tab-delimited-text-file-with-fields-that-contain-embedded/m-p/963697#M375395</link>
      <description>&lt;P&gt;Consider a tab-delimited text file that has a header row and data rows whose fields can contain a CR-LF.&amp;nbsp; A field (data item) is enclosed in double quotes when:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;the data item contains a double quote, in which case the double quote is doubled&lt;/LI&gt;
&lt;LI&gt;the data item contains a CRLF or other control character (TAB, etc.)&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Unfortunately, PROC IMPORT does not honor the open double quote when it encounters an embedded CRLF, and it presumes the row ends at that point.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Data example (each line ends with an implicit CRLF):&lt;/P&gt;
&lt;PRE&gt;a&amp;lt;TAB&amp;gt;x&amp;lt;TAB&amp;gt;c&lt;BR /&gt;"A&amp;lt;CR&amp;gt;&amp;lt;LF&amp;gt;B"&amp;lt;TAB&amp;gt;3.14&amp;lt;TAB&amp;gt;Okay&lt;BR /&gt;"AAAA&amp;lt;CR&amp;gt;&amp;lt;LF&amp;gt;BCD"&amp;lt;TAB&amp;gt;3.14&amp;lt;TAB&amp;gt;Okay&lt;/PRE&gt;
&lt;PRE&gt;filename data temp ;
data _null_ ;&lt;BR /&gt;  file data ;
  put "61097809630D0A22410D0A422209332E3134094F6B61790D0A"x @@ ;&lt;BR /&gt;  put "22414141410D0A4243442209332E3134094F6B61790D0A"x @@ ;&lt;BR /&gt;  stop ;&lt;BR /&gt;run ;&lt;/PRE&gt;
&lt;P&gt;PROC IMPORT falters at the embedded CRLF:&lt;/P&gt;
&lt;LI-CODE lang="sas"&gt;proc import datafile=data dbms=csv replace out=example ;
  guessingrows=max ;
  delimiter='09'x ;
  getnames=yes ;
run ;
&lt;/LI-CODE&gt;
&lt;P&gt;Logs and outputs&lt;/P&gt;
&lt;LI-CODE lang="sas"&gt;Number of names found is greater than number of variables found. 
52             data WORK.EXAMPLE    ;
53             %let _EFIERR_ = 0; /* set the ERROR detection macro variable */
54             infile DATA delimiter='09'x MISSOVER DSD  firstobs=2 ;
55                informat a $12. ;
56                format a $12. ;
57             input
58                         a  $
59             ;
60             if _ERROR_ then call symputx('_EFIERR_',1);  /* set ERROR detection macro variable */
61             run;
5 rows created in WORK.EXAMPLE from DATA.
Obs    a
 1     "A            
 2     B"            
 3     "AAAA         
 4     BCD"          
 5  &lt;/LI-CODE&gt;
&lt;P&gt;DSD only protects delimiters embedded in quoted values and treats consecutive delimiters as a missing value. It does not protect an embedded CRLF.&lt;/P&gt;
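&lt;P&gt;A minimal sketch of the two things DSD does handle. For readability it uses a comma as the delimiter and in-stream data instead of the tab-delimited file; the data set name dsd_demo and the sample line are made up for illustration.&lt;/P&gt;
&lt;LI-CODE lang="sas"&gt;data dsd_demo ;
  /* DSD: quoted values may contain the delimiter, and ,, means a missing value */
  infile datalines delimiter=',' dsd truncover ;
  length a $12 b $8 c $8 ;
  input a b c ;
  /* a should hold x,y with the bounding quotes removed, b should be blank */
  putlog a= b= c= ;
datalines ;
"x,y",,Okay
;
run ;
&lt;/LI-CODE&gt;
&lt;P&gt;Nothing in this option changes where a record ends, which is why a CRLF inside the quotes still splits the row.&lt;/P&gt;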
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Reading the entire file as a single record (&lt;A href="https://support.sas.com/resources/papers/proceedings/proceedings/forum2007/027-2007.pdf" target="_self"&gt;"Handling Large Stream Files with the @'string' Feature", Rick Langston&lt;/A&gt;) has some promise, but it would require bookkeeping to decide where each 'row' ends: counting the pieces parsed out and matching that count against the number of pieces in the first row.&lt;/P&gt;
&lt;LI-CODE lang="sas"&gt;data _null_ ;
  infile DATA delimiter='09'x DSD lrecl=1000 recfm=f;
  length piece $20 ;
  input piece @@ ;
  putlog _n_= piece= ;
  
  if _n_ = 10 then stop ;
run ;&lt;/LI-CODE&gt;
&lt;P&gt;Log (annotated):&lt;/P&gt;
&lt;LI-CODE lang="sas"&gt;_N_=1 piece=a
_N_=2 piece=x
_N_=3 piece=c        &amp;lt;eol CRLF&amp;gt;
"A                   &amp;lt;data item CRLF&amp;gt; 
B"
_N_=4 piece=3.14
_N_=5 piece=Okay     &amp;lt;eol CRLF&amp;gt;
"AAAA                &amp;lt;data item CRLF&amp;gt;
BCD"
_N_=6 piece=3.14
_N_=7 piece=Okay     &amp;lt;eol&amp;gt;
&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Is this approach worthwhile, or are there other features or procedure options that can read the file in a simpler way?&lt;/P&gt;</description>
      <pubDate>Tue, 08 Apr 2025 13:38:42 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Read-a-tab-delimited-text-file-with-fields-that-contain-embedded/m-p/963697#M375395</guid>
      <dc:creator>RichardAD</dc:creator>
      <dc:date>2025-04-08T13:38:42Z</dc:date>
    </item>
    <item>
      <title>Re: Read a tab delimited text file with fields that contain embedded CR-LF</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Read-a-tab-delimited-text-file-with-fields-that-contain-embedded/m-p/963721#M375406</link>
      <description>&lt;P&gt;In a quick search, I found this thread:&lt;BR /&gt;&lt;A href="https://communities.sas.com/t5/SAS-Programming/Input-a-CSV-file-with-text-fields-that-contain-line-breaks/td-p/821603" target="_blank"&gt;https://communities.sas.com/t5/SAS-Programming/Input-a-CSV-file-with-text-fields-that-contain-line-breaks/td-p/821603&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;There,&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/159"&gt;@Tom&lt;/a&gt;&amp;nbsp;links to his utility macro, which replaces CRLFs that appear inside quotes. It could be a helpful approach for pre-processing the file:&lt;/P&gt;
&lt;P&gt;&lt;A href="https://github.com/sasutils/macros/blob/master/replace_crlf.sas" target="_blank"&gt;https://github.com/sasutils/macros/blob/master/replace_crlf.sas&lt;/A&gt;&lt;/P&gt;
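&lt;P&gt;For illustration only, the general idea of that kind of pre-processing can be sketched in a short DATA step. This is not the macro itself, just a minimal sketch under stated assumptions: the fileref FIXED is a name made up here, DATA is the fileref from the question, embedded CRs are dropped and embedded LFs become a space, and every double quote in the file either bounds a field or is doubled inside one.&lt;/P&gt;
&lt;LI-CODE lang="sas"&gt;/* FIXED is a hypothetical fileref for the cleaned copy */
filename fixed temp ;

data _null_ ;
  infile data recfm=n ;      /* read the file as a byte stream */
  file fixed recfm=n ;       /* write bytes back with no added line ends */
  retain inquote 0 ;
  input ch $char1. ;
  /* toggle on every double quote; a doubled quote toggles twice */
  if ch = '"' then inquote = not inquote ;
  if inquote then do ;
    if ch = '0D'x then return ;    /* drop a CR inside quotes */
    if ch = '0A'x then ch = ' ' ;  /* turn an LF inside quotes into a space */
  end ;
  put ch $char1. ;
run ;

/* then import the cleaned copy instead of the original */
proc import datafile=fixed dbms=csv replace out=example ;
  delimiter='09'x ;
  getnames=yes ;
  guessingrows=max ;
run ;
&lt;/LI-CODE&gt;
&lt;P&gt;Because a doubled quote flips the flag twice, it leaves the in-quote state unchanged, so only line breaks between a field's bounding quotes are touched. The original line breaks are lost in the imported values unless a different replacement character is chosen.&lt;/P&gt;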
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 08 Apr 2025 15:33:24 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Read-a-tab-delimited-text-file-with-fields-that-contain-embedded/m-p/963721#M375406</guid>
      <dc:creator>Quentin</dc:creator>
      <dc:date>2025-04-08T15:33:24Z</dc:date>
    </item>
  </channel>
</rss>

