<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: IMport text data ---bar (|) delimited in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/IMport-text-data-bar-delimited/m-p/698234#M213518</link>
    <description>No, those spaced id-s are so many, that no manual cleaning is possible. See&lt;BR /&gt;id=BC351181-00 then the next id does not start from a new line, etc. all sequential id-s are not started from a line.</description>
    <pubDate>Wed, 11 Nov 2020 21:56:27 GMT</pubDate>
    <dc:creator>Emma8</dc:creator>
    <dc:date>2020-11-11T21:56:27Z</dc:date>
    <item>
      <title>IMport text data ---bar (|) delimited</title>
      <link>https://communities.sas.com/t5/SAS-Programming/IMport-text-data-bar-delimited/m-p/698219#M213508</link>
      <description>&lt;P&gt;I would like to import text (attached example data). Id and text are 2 variables. There are spaces before id, so I could not import successfully into SAS (the actual data have about 5000 rows). Thank you.&lt;/P&gt;</description>
      <pubDate>Wed, 11 Nov 2020 22:51:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/IMport-text-data-bar-delimited/m-p/698219#M213508</guid>
      <dc:creator>Emma8</dc:creator>
      <dc:date>2020-11-11T22:51:50Z</dc:date>
    </item>
    <item>
      <title>Re: IMport text data ---bar (|) delimited</title>
      <link>https://communities.sas.com/t5/SAS-Programming/IMport-text-data-bar-delimited/m-p/698228#M213512</link>
      <description>&lt;P&gt;How did you try to READ the data (you don't need to "IMPORT" a text file)?&lt;/P&gt;
&lt;P&gt;The file you posted as 4 lines of data and one header line.&lt;/P&gt;
&lt;PRE&gt;NOTE: 5 records were read from the infile "c:\downloads\text_w2530_15.txt".
      The minimum record length was 7.
      The maximum record length was 14508.
&lt;/PRE&gt;
&lt;P&gt;That second field is very long for a SAS variable. The maximum length of a character variable in SAS is 32K bytes.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;None of the lines start with spaces.&amp;nbsp;&lt;/STRONG&gt; &lt;STRONG&gt;None of the values of TEXT start with spaces&lt;/STRONG&gt; (other than the one that is totally empty).&amp;nbsp; here are the first 50 characters of each line:&lt;/P&gt;
&lt;PRE&gt;47    data _null_;
48      infile "c:\downloads\text_w2530_15.txt" lrecl=50 ;
49      input;
50      list;
51    run;

NOTE: A byte-order mark in the file "c:\downloads\text_w2530_15.txt" (for fileref "#LN00064") indicates
      that the data is encoded in "utf-8".  This encoding will be used to process the file.
NOTE: The infile "c:\downloads\text_w2530_15.txt" is:
      Filename=c:\downloads\text_w2530_15.txt,
      RECFM=V,LRECL=200,File Size (bytes)=17564,
      Last Modified=11Nov2020:16:00:30,
      Create Time=11Nov2020:16:00:29

RULE:     ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+--
1         ID|Text 7
2         BC51140-00|9/4/15, BILATERAL MAMMOGRAPHY AND LEFT
3         BC151142-00|
4         BC251175-00|9/1/2015 (WHC) CT ANGIO (EVAL SOB)  NO
5         BC351181-00|9/16/2015 (WHC) DX MAMMO/US: 7CM LOBUL
NOTE: 5 records were read from the infile "c:\downloads\text_w2530_15.txt".
      The minimum record length was 7.
      The maximum record length was 50.
      One or more lines were truncated.
&lt;/PRE&gt;</description>
      <pubDate>Wed, 11 Nov 2020 21:16:28 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/IMport-text-data-bar-delimited/m-p/698228#M213512</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2020-11-11T21:16:28Z</dc:date>
    </item>
    <item>
      <title>Re: IMport text data ---bar (|) delimited</title>
      <link>https://communities.sas.com/t5/SAS-Programming/IMport-text-data-bar-delimited/m-p/698230#M213514</link>
      <description>&lt;P&gt;Something is messed up with the last line in that file.&amp;nbsp; Why is it so much longer than the others? Why does it have multiple pipe characters?&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data check;
  infile "c:\downloads\text_w2530_15.txt" length=ll;
  input;
  line+1;
  line_length=ll;
  length=lengthn(_infile_);
  pipes=countc(_infile_,'|');
run;

proc print;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;PRE&gt;                line_
Obs    line    length    length    pipes

 1       1          7         7       1
 2       2       1012       638       1
 3       3       1013        12       1
 4       4       1013       514       1
 5       5      14508     14024      14
&lt;/PRE&gt;</description>
      <pubDate>Wed, 11 Nov 2020 21:21:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/IMport-text-data-bar-delimited/m-p/698230#M213514</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2020-11-11T21:21:50Z</dc:date>
    </item>
    <item>
      <title>Re: IMport text data ---bar (|) delimited</title>
      <link>https://communities.sas.com/t5/SAS-Programming/IMport-text-data-bar-delimited/m-p/698231#M213515</link>
      <description>I could not by read or import. &lt;BR /&gt;There are about 17 observations:&lt;BR /&gt;first id=BC51140-00&lt;BR /&gt;and the last id=BCdf55403-00&lt;BR /&gt;&lt;BR /&gt;If you scroll right, then you would see how the ID-s have spaced to the right.&lt;BR /&gt;</description>
      <pubDate>Wed, 11 Nov 2020 21:24:03 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/IMport-text-data-bar-delimited/m-p/698231#M213515</guid>
      <dc:creator>Emma8</dc:creator>
      <dc:date>2020-11-11T21:24:03Z</dc:date>
    </item>
    <item>
      <title>Re: IMport text data ---bar (|) delimited</title>
      <link>https://communities.sas.com/t5/SAS-Programming/IMport-text-data-bar-delimited/m-p/698232#M213516</link>
      <description>&lt;P&gt;Where did you get the file?&amp;nbsp; Have you attempted to edit it in any way?&amp;nbsp; If so do you still have the original?&lt;/P&gt;
&lt;P&gt;How was it created?&lt;/P&gt;</description>
      <pubDate>Wed, 11 Nov 2020 21:37:00 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/IMport-text-data-bar-delimited/m-p/698232#M213516</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2020-11-11T21:37:00Z</dc:date>
    </item>
    <item>
      <title>Re: IMport text data ---bar (|) delimited</title>
      <link>https://communities.sas.com/t5/SAS-Programming/IMport-text-data-bar-delimited/m-p/698234#M213518</link>
      <description>No, those spaced id-s are so many, that no manual cleaning is possible. See&lt;BR /&gt;id=BC351181-00 then the next id does not start from a new line, etc. all sequential id-s are not started from a line.</description>
      <pubDate>Wed, 11 Nov 2020 21:56:27 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/IMport-text-data-bar-delimited/m-p/698234#M213518</guid>
      <dc:creator>Emma8</dc:creator>
      <dc:date>2020-11-11T21:56:27Z</dc:date>
    </item>
    <item>
      <title>Re: IMport text data ---bar (|) delimited</title>
      <link>https://communities.sas.com/t5/SAS-Programming/IMport-text-data-bar-delimited/m-p/698236#M213519</link>
      <description>&lt;P&gt;You really need to find the process that is writing this file and fix that.&lt;/P&gt;
&lt;P&gt;Here is method that appears to work for your example file.&lt;/P&gt;
&lt;P&gt;Basically the first step removes the header line and all line breaks from the original file and inserts line breaks after the pipe characters.&lt;/P&gt;
&lt;P&gt;Then the second step reads the lines and pulls the ID value from the end of the line.&amp;nbsp; It uses the LAG() function to move the id from the end of the previous line to the text on the current line.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;filename copy temp;
data _null_;
  infile "c:\downloads\text_w2530_15.txt" recfm=f lrecl=1 end=eof;
  file copy ;
  * Get rid of "header" line ;
  do until(x='0A'x); input x $char1.; end;
  do while(not eof);
    do until(x='|'); input x $char1.; if x not in ('0A'x,'0D'x) then put x $char1. @; end;
    put;
  end;
run;

data want;
  length nextid id $30 text $1500 ;
  infile copy ;
  input ;
  text = _infile_;
  if substr(text,length(text))='|' then do;
    nextid = scan(scan(text,-1,' '),1,'|');
    text = substr(text,1,length(text)-lengthn(nextid)-2);
  end;
  id = lag(nextid);
  if _n_ &amp;gt; 1 then output;
  drop nextid ;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;You can then easily write it back out as a delimited file than can actually be read.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
  file 'c:\downloads\fixed.txt' dsd dlm='|';
  set want;
  put id text;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;So you appear to have 17 "lines" in that original file.&amp;nbsp; Just so you can see it here are the lines truncated to just 50 bytes.&lt;/P&gt;
&lt;PRE&gt;337   data _null_;
338     infile  'c:\downloads\fixed.txt' truncover;
339     input line $char50.;
340     put line $char50.;
341   run;

NOTE: The infile 'c:\downloads\fixed.txt' is:
      Filename=c:\downloads\fixed.txt,
      RECFM=V,LRECL=32767,File Size (bytes)=5562,
      Last Modified=11Nov2020:17:20:57,
      Create Time=11Nov2020:17:20:57

BC51140-00|9/4/15, BILATERAL MAMMOGRAPHY AND LEFT
BC151142-00|
BC251175-00|9/1/2015 (WHC) CT ANGIO (EVAL SOB)  NO
BC351181-00|9/16/2015 (WHC) DX MAMMO/US: 7CM LOBUL
BC751182-00|7/22/15, MRI: MASS WITHIN SEGMENT VII
BC7851184-00|
BCG51185-00|08/05/15 PERIPHERAL BLOOD SMEAR: CHRON
BCT51186-00|SKELETAL SURVEY 8/13/15: DIFFUSE OSTEO
BCTY51187-00|05/27/2015-UNK LOCATION - FLEXIBLE SI
BC0951188-00|- 09/14/2015-WHC - MRI RT UPPER EXTRM
BC56651190-00|
BC56751192-00|CT A/P 9/17/15 -  ENDOMETRIAL MASS W
BC56hu51196-00|09/04/2015-WHC-CT ABD/PELVIS-2.5CM
BC0955237-02|NONE
BCV55391-00|03/23/2015 (SMH) CT ABDOMEN/PELVIS: PA
BCtgy55402-00|02/12/2015: LEFT BREAST US- 1.5 CM I
BCdf55403-00|IMPRESSION 1. RESPONSE TO TREATMENT W
NOTE: 17 records were read from the infile 'c:\downloads\fixed.txt'.
      The minimum record length was 12.
      The maximum record length was 915.&lt;/PRE&gt;</description>
      <pubDate>Wed, 11 Nov 2020 22:28:23 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/IMport-text-data-bar-delimited/m-p/698236#M213519</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2020-11-11T22:28:23Z</dc:date>
    </item>
    <item>
      <title>Re: IMport text data ---bar (|) delimited</title>
      <link>https://communities.sas.com/t5/SAS-Programming/IMport-text-data-bar-delimited/m-p/698243#M213522</link>
      <description>Thank you so much!</description>
      <pubDate>Wed, 11 Nov 2020 22:47:33 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/IMport-text-data-bar-delimited/m-p/698243#M213522</guid>
      <dc:creator>Emma8</dc:creator>
      <dc:date>2020-11-11T22:47:33Z</dc:date>
    </item>
  </channel>
</rss>

