<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: complicated txt file in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/complicated-txt-file/m-p/351666#M81875</link>
    <description>&lt;P&gt;Adapted to your new requirement, assuming the values are digits and spaces:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data test;
if not prxid then 
    prxid + prxparse("/(AN01|BC01|CR4R|DY03)[ 0-9]+/i");
infile datalines truncover;
input @1 custId $9. line $200.;
length item $20;
start = 1;
stop = length(line);
call prxnext(prxid, start, stop, line, pos, len);
do while (pos &amp;gt; 0);
    item = substr(line, pos, len);
    output;
    call prxnext(prxid, start, stop, line, pos, len);
    end;
drop prxid start stop pos len line;
datalines;
CUST12345AN01    98BC01  98BC01  89CR4R 4 5DY03 9
CUST12346AN01    98BC01  98BC01  89 BC017 89CR4R 4 5DY03 9
;

proc print; run;
&lt;/CODE&gt;&lt;/PRE&gt;</description>
    <pubDate>Thu, 20 Apr 2017 13:24:16 GMT</pubDate>
    <dc:creator>PGStats</dc:creator>
    <dc:date>2017-04-20T13:24:16Z</dc:date>
    <item>
      <title>complicated txt file</title>
      <link>https://communities.sas.com/t5/SAS-Programming/complicated-txt-file/m-p/351515#M81810</link>
      <description>&lt;P&gt;Hi folks, thanks for all the great suggestions. I realized that I simplified the problem too much in my original post, and I still can’t figure it out from your suggestions. So here is the problem:&lt;/P&gt;
&lt;P&gt;First of all, there are no delimiters; everything is based on position. Each row is one customer and can contain several types of items. Every row starts with the Customer ID, which takes the first 9 characters. There are 4 types of items with similar patterns but different lengths. Each item starts with 4 characters (a combination of letters and numbers).&lt;/P&gt;
&lt;P&gt;One type starts with "AN01" and has a length of 10; one type starts with "BC01" and has a length of 8.&lt;/P&gt;
&lt;P&gt;One type starts with "CR4R" and has a length of 8; one type starts with "DY03" and has a length of 6.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Each row can have any number of items of any type. For example, one row can have 2 “AN01”, followed by 3 “BC01”, then 10 “CR4R”, then 2 “DY03”. Another row can have 3 “AN01”, no “BC01”, then 8 “CR4R”, then 1 “DY03”.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The tricky part is that every type can have spaces in its value, e.g. ‘BC01&amp;nbsp; 89’.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;For example:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;CUST12345AN01&amp;nbsp;&amp;nbsp;&amp;nbsp; 98BC01&amp;nbsp; 98BC01&amp;nbsp; 89CR4R 4 5DY03 9
CUST12346AN01&amp;nbsp;&amp;nbsp;&amp;nbsp; 98BC01&amp;nbsp; 98BC01&amp;nbsp; 89 BC017 89CR4R 4 5DY03 9&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The first line means: 'CUST12345' is the customer ID, followed by 5 items:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;AN01&amp;nbsp;&amp;nbsp;&amp;nbsp; 98
BC01&amp;nbsp; 98
BC01&amp;nbsp; 89
CR4R 4 5
DY03 9&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I would like the output to have one item per row along with its Customer ID:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;CUST_ID&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ITEM
CUST12345 AN01&amp;nbsp;&amp;nbsp;&amp;nbsp; 98
CUST12345 BC01&amp;nbsp; 98
CUST12345 BC01&amp;nbsp; 89
CUST12345 CR4R 4 5
CUST12345 DY03 9
CUST12346 AN01&amp;nbsp;&amp;nbsp;&amp;nbsp; 98
CUST12346 BC01&amp;nbsp; 98
CUST12346 BC01&amp;nbsp; 89
CUST12346 BC017 89
CUST12346 CR4R 4 5
CUST12346 DY03 9&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Original post:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Hi folks, I have another complicated txt file that I need to read in based on position. There is no delimiter. Each row is one customer, and there are two types of items. The Customer ID always takes the first 9 characters. One type of item starts with "A" and has a length of 10; the other type starts with "B" and has a length of 15. Each customer can have any number of A or B items. The tricky part is that the A and B values can contain spaces.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;For example:&lt;/P&gt;
&lt;P&gt;CUST12345A10 &amp;nbsp; &amp;nbsp; 98B98 &amp;nbsp; &amp;nbsp;89 &amp;nbsp;1245A20 &amp;nbsp; &amp;nbsp; 98B88 &amp;nbsp; &amp;nbsp;89 &amp;nbsp;1245&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;This line means: 'CUST12345' is the customer ID&lt;/P&gt;
&lt;P&gt;'A10 &amp;nbsp; &amp;nbsp; 98' is the value for the first "A" item&lt;/P&gt;
&lt;P&gt;'B98 &amp;nbsp; &amp;nbsp;89 &amp;nbsp;1245' is &amp;nbsp;the value for &amp;nbsp;the first "B" item&lt;/P&gt;
&lt;P&gt;'A20 &amp;nbsp; &amp;nbsp; 98'&amp;nbsp;is the value for the&amp;nbsp;2nd&amp;nbsp;"A" item&lt;/P&gt;
&lt;P&gt;'B88 &amp;nbsp; &amp;nbsp;89 &amp;nbsp;1245'&amp;nbsp;is &amp;nbsp;the value for the 2nd&amp;nbsp;"B" item&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The output should look like:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;CUST12345 A10 98&lt;/P&gt;
&lt;P&gt;CUST12345&amp;nbsp;A20 98&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The spaces in the “A” and “B” values drive me crazy. Any ideas on how to create the output?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 20 Apr 2017 16:35:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/complicated-txt-file/m-p/351515#M81810</guid>
      <dc:creator>yymissing</dc:creator>
      <dc:date>2017-04-20T16:35:46Z</dc:date>
    </item>
    <item>
      <title>Re: complicated txt file</title>
      <link>https://communities.sas.com/t5/SAS-Programming/complicated-txt-file/m-p/351526#M81815</link>
      <description>&lt;P&gt;Regular expressions to the rescue:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data test;
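/* Compile the pattern once; the sum statement retains prxid across rows */
if not prxid then prxid + prxparse("/[AB][^AB]+/i");
infile datalines truncover;
/* The customer ID (e.g. CUST12345) occupies columns 1-9; the rest of the row holds the items */
input @1 custId $9. line $200.;
length value $20;
start = 1;
stop = length(line);
/* Each match is an A or B followed by everything up to the next A or B */
call prxnext(prxid, start, stop, line, pos, len);
do while (pos &amp;gt; 0);
    value = substr(line, pos, len);
    output;
    call prxnext(prxid, start, stop, line, pos, len);
end;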
drop prxid start stop pos len line;
datalines;
CUST12345A10     98B98    89  1245A20     98B88    89  1245
;

proc print; run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 20 Apr 2017 03:37:25 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/complicated-txt-file/m-p/351526#M81815</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2017-04-20T03:37:25Z</dc:date>
    </item>
    <item>
      <title>Re: complicated txt file</title>
      <link>https://communities.sas.com/t5/SAS-Programming/complicated-txt-file/m-p/351546#M81823</link>
      <description>&lt;P&gt;Or slightly different:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data WANT;
  /* Match only the "A" items: an A followed by everything up to the next A or B */
  if not PRXID then PRXID + prxparse("/A[^AB]+/i");
  infile datalines truncover;
  input ;
  length VALUE $20;
  START = index(_infile_,'A');          
  CUSTID=substr(_infile_,1,START-1);
  call prxnext(PRXID, START, -1, _infile_, POS, LEN);
  do while (POS &amp;gt; 0);       
      VALUE = substr(_infile_, POS, LEN);
      output;
      call prxnext(PRXID, START, -1, _infile_, POS, LEN);
  end;
  keep CUSTID VALUE;
datalines;
CUST12345A10     98B98    89  1245A20     98B88    89  1245
;
run;

proc print noobs; 
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;TABLE class="table" rules="all" frame="box" cellspacing="0" cellpadding="5" summary="Procedure Print: Data Set WORK.WANT"&gt;
&lt;THEAD&gt;
&lt;TR&gt;
&lt;TH class="l header" scope="col"&gt;VALUE&lt;/TH&gt;
&lt;TH class="l header" scope="col"&gt;CUSTID&lt;/TH&gt;
&lt;/TR&gt;
&lt;/THEAD&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD class="l data"&gt;A10 98&lt;/TD&gt;
&lt;TD class="l data"&gt;CUST12345&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="l data"&gt;A20 98&lt;/TD&gt;
&lt;TD class="l data"&gt;CUST12345&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;</description>
      <pubDate>Thu, 20 Apr 2017 05:15:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/complicated-txt-file/m-p/351546#M81823</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2017-04-20T05:15:09Z</dc:date>
    </item>
    <item>
      <title>Re: complicated txt file</title>
      <link>https://communities.sas.com/t5/SAS-Programming/complicated-txt-file/m-p/351597#M81841</link>
      <description>&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
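infile cards truncover;
/* Read the 9-character ID and the first A (10 chars) / B (15 chars) pair; @ holds the line */
input cust $9. a $10. b $15. @;
/* Assumes items alternate A, B; with truncover, a is missing once the line runs out */
do while(not missing(a));
 a1=substr(a,1,3);          /* item code, e.g. A10 */
 a2=substr(a,length(a)-2);  /* last three characters of the A value */
 output;
 input a $10. b $15. @;
end;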

cards;
CUST12345A10     98B98    89  1245A20     98B88    89  1245 
CUST22345A10     98B98    89  1245 
;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 20 Apr 2017 10:40:02 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/complicated-txt-file/m-p/351597#M81841</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2017-04-20T10:40:02Z</dc:date>
    </item>
    <item>
      <title>Re: complicated txt file</title>
      <link>https://communities.sas.com/t5/SAS-Programming/complicated-txt-file/m-p/351666#M81875</link>
      <description>&lt;P&gt;Adapted to your new requirement, assuming the values are digits and spaces:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data test;
if not prxid then 
    prxid + prxparse("/(AN01|BC01|CR4R|DY03)[ 0-9]+/i");
infile datalines truncover;
input @1 custId $9. line $200.;
length item $20;
start = 1;
stop = length(line);
call prxnext(prxid, start, stop, line, pos, len);
do while (pos &amp;gt; 0);
    item = substr(line, pos, len);
    output;
    call prxnext(prxid, start, stop, line, pos, len);
    end;
drop prxid start stop pos len line;
datalines;
CUST12345AN01    98BC01  98BC01  89CR4R 4 5DY03 9
CUST12346AN01    98BC01  98BC01  89 BC017 89CR4R 4 5DY03 9
;

proc print; run;
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 20 Apr 2017 13:24:16 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/complicated-txt-file/m-p/351666#M81875</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2017-04-20T13:24:16Z</dc:date>
    </item>
    <item>
      <title>Re: complicated txt file</title>
      <link>https://communities.sas.com/t5/SAS-Programming/complicated-txt-file/m-p/351799#M81924</link>
      <description>&lt;P&gt;So read ahead; then you can calculate how many characters to read based on the beginning of the next item. You can use negative cursor movement values, such as +(-10), to skip back, and the $VARYING. informat to read varying-length fields.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want ;
  infile cards truncover column = col;
  length cust_id $9 n 8 item $10 ;
  input cust_id $9. item $10. +(-10)  @ ;
  do n=1 by 1 until(item=' ');
    select ;
      when (item=:'AN') len=10;
      when (item=:'BC') len=8;
      when (item=:'CR') len=8;
      when (item=:'DY') len=6;
      otherwise len=0;
    end;
    input item $varying10. len  @;
    if n=1 or item ne ' ' then output;
    input item $10. +(-10) @ ;
  end;
*---+----0----+----0----+----0----+----0----+----0----+----0;
cards;
CUST12345AN01    98BC01  98BC01  89CR4R 4 5DY03 9
CUST12346AN01    98BC01  98BC01  89BC017 89CR4R 4 5DY03 9
;
proc print; run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 20 Apr 2017 17:00:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/complicated-txt-file/m-p/351799#M81924</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2017-04-20T17:00:50Z</dc:date>
    </item>
    <item>
      <title>Re: complicated txt file</title>
      <link>https://communities.sas.com/t5/SAS-Programming/complicated-txt-file/m-p/352056#M82025</link>
      <description>&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
infile cards truncover;
input x $200.;
cards;
CUST12345AN01    98BC01  98BC01  89CR4R 4 5DY03 9
CUST12346AN01    98BC01  98BC01  89 BC017 89CR4R 4 5DY03 9
;
run;
data temp;
 set have;
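 /* Each field starts with 2 letters plus 2 letters or digits (CUST, AN01, BC01, CR4R, DY03), */
 /* followed by digits and spaces; a bare [a-z]+ prefix would split CR4R into "CR4" and "R 4 5". */
 pid=prxparse('/[a-z]{2}[a-z\d]{2}[\d\s]+/io');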
 s=1;e=length(x);
 call prxnext(pid,s,e,x,p,l);
 do while(p gt 0);
  temp=substr(x,p,l);output;
  call prxnext(pid,s,e,x,p,l);
 end;
 keep temp;
run;

data want;
 set temp;
 length cust $ 40;
 retain cust;
 if upcase(temp) =: 'CUST' then do;
  cust=temp;delete;
 end;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Fri, 21 Apr 2017 04:52:08 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/complicated-txt-file/m-p/352056#M82025</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2017-04-21T04:52:08Z</dc:date>
    </item>
  </channel>
</rss>

