<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Extract ID's separated by dashes from text string in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Extract-ID-s-separated-by-dashes-from-text-string/m-p/72862#M15709</link>
    <description>Hi:&lt;BR /&gt;
  There is an alternate solution -- but it requires the following to be true:&lt;BR /&gt;
1) there is always 1 space between VAR1 (ID), VAR2 and VAR3 and&lt;BR /&gt;
2) the last piece of ID -ALWAYS- ends in a number&lt;BR /&gt;
 &lt;BR /&gt;
  If these are true, then you can parse out ID when you first read the input file, as shown in the code below, using the ANYDIGIT function.&lt;BR /&gt;
 &lt;BR /&gt;
cynthia&lt;BR /&gt;
[pre]&lt;BR /&gt;
data readdash;&lt;BR /&gt;
length var1 $35 var2 8 var3 $32;&lt;BR /&gt;
   ** This infile/input combination assumes that there is ALWAYS;&lt;BR /&gt;
   ** at least 1 space between each variable in the input file.;&lt;BR /&gt;
   infile datalines dlm=' ' ;&lt;BR /&gt;
   input Var1 &amp;amp;$ Var2:mmddyy10. Var3 &amp;amp;$32.;&lt;BR /&gt;
   format var2 mmddyy10.;&lt;BR /&gt;
                   &lt;BR /&gt;
   ** As long as ID -always- ends in a number,;&lt;BR /&gt;
   ** search backwards from the end of VAR1 and find the last digit;&lt;BR /&gt;
   ** in the VAR1 variable. With ANYDIGIT, if start is negative, the;&lt;BR /&gt;
   ** search starts at the END of the string and goes to the right.;&lt;BR /&gt;
   ** So the value of GOTNUM is the position of the digit before the ----.;&lt;BR /&gt;
   lg = length(var1)*-1;&lt;BR /&gt;
   gotnum = anydigit(var1,lg);&lt;BR /&gt;
                                             &lt;BR /&gt;
   ** could obs 3 happen in the data??;&lt;BR /&gt;
   if gotnum gt 0 then id = substr(var1,1,gotnum);&lt;BR /&gt;
   else if gotnum = 0 then id = var1;&lt;BR /&gt;
return;&lt;BR /&gt;
datalines;&lt;BR /&gt;
AB6772-CD5885-ABCD6194-XY7199----  09/14/2010 London&lt;BR /&gt;
PQ5928-SP352-BZ5752-----  09/16/2010 New York&lt;BR /&gt;
ZZzzzz-SPeee-YYaaaa-----  09/16/2010 Albuquerque&lt;BR /&gt;
;;;;&lt;BR /&gt;
run;&lt;BR /&gt;
                     &lt;BR /&gt;
proc print data=readdash;&lt;BR /&gt;
run;&lt;BR /&gt;
[/pre]</description>
    <pubDate>Fri, 17 Sep 2010 18:37:03 GMT</pubDate>
    <dc:creator>Cynthia_sas</dc:creator>
    <dc:date>2010-09-17T18:37:03Z</dc:date>
    <item>
      <title>Extract ID's separated by dashes from text string</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Extract-ID-s-separated-by-dashes-from-text-string/m-p/72859#M15706</link>
      <description>I have a dataset that contains the following sample observations.  Var1 has ID's separated by dashes.  Var2 is date and Var3 is location.&lt;BR /&gt;
&lt;BR /&gt;
Var1                                                       Var2                Var3&lt;BR /&gt;
AB6772-CD5885-ABCD6194-XY7199----     09/14/2010       London&lt;BR /&gt;
PQ5928-SP352-BZ5752-----                      09/16/2010       New York&lt;BR /&gt;
&lt;BR /&gt;
Each string in Var1 has from 1 to 8 ID's.  In this example, there are 4 ID's in the first obs and 3 in the second.  The dashes at the end of the string mean that that there are no more ID's for that obs.&lt;BR /&gt;
&lt;BR /&gt;
I would like to extract each ID and transpose the file as follows:&lt;BR /&gt;
&lt;BR /&gt;
ID	  Var2	                Var3&lt;BR /&gt;
AB6772	  09/14/2010	London&lt;BR /&gt;
CD5885	  09/14/2010	London&lt;BR /&gt;
ABCD6194  09/14/2010	London&lt;BR /&gt;
XY7199	  09/14/2010	London&lt;BR /&gt;
PQ5928	  09/16/2010	New York&lt;BR /&gt;
SP352	  09/16/2010	New York&lt;BR /&gt;
BZ5752	  09/16/2010	New York&lt;BR /&gt;
&lt;BR /&gt;
I am looking for a simple way to achieve this.  Any suggestions?</description>
      <pubDate>Thu, 16 Sep 2010 19:50:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Extract-ID-s-separated-by-dashes-from-text-string/m-p/72859#M15706</guid>
      <dc:creator>hkassam</dc:creator>
      <dc:date>2010-09-16T19:50:46Z</dc:date>
    </item>
    <item>
      <title>Re: Extract ID's separated by dashes from text string</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Extract-ID-s-separated-by-dashes-from-text-string/m-p/72860#M15707</link>
      <description>Does this look right?&lt;BR /&gt;
&lt;BR /&gt;
[pre]&lt;BR /&gt;
data need;&lt;BR /&gt;
   input Var1 &amp;amp;$64. Var2:mmddyy. Var3 &amp;amp;$32.;&lt;BR /&gt;
   format var2 mmddyy.;&lt;BR /&gt;
   cards;&lt;BR /&gt;
AB6772-CD5885-ABCD6194-XY7199----  09/14/2010 London&lt;BR /&gt;
PQ5928-SP352-BZ5752-----  09/16/2010 New York&lt;BR /&gt;
;;;;&lt;BR /&gt;
   run;&lt;BR /&gt;
data need;&lt;BR /&gt;
   length id $16;&lt;BR /&gt;
   set need;&lt;BR /&gt;
   do i = 1 to 8;&lt;BR /&gt;
      id = scan(var1,i,'-','O');&lt;BR /&gt;
      if missing(id) then leave;&lt;BR /&gt;
      output;         &lt;BR /&gt;
      end;&lt;BR /&gt;
   drop var1 i;&lt;BR /&gt;
   run;&lt;BR /&gt;
proc print;&lt;BR /&gt;
   run;&lt;BR /&gt;
proc contents;&lt;BR /&gt;
   run; &lt;BR /&gt;
[/pre]</description>
      <pubDate>Thu, 16 Sep 2010 20:11:17 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Extract-ID-s-separated-by-dashes-from-text-string/m-p/72860#M15707</guid>
      <dc:creator>data_null__</dc:creator>
      <dc:date>2010-09-16T20:11:17Z</dc:date>
    </item>
    <item>
      <title>Re: Extract ID's separated by dashes from text string</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Extract-ID-s-separated-by-dashes-from-text-string/m-p/72861#M15708</link>
      <description>Yes, it works like a charm.  I inserted the do loop in my data step reading the original dataset and it did the job.&lt;BR /&gt;
&lt;BR /&gt;
I did not realize the scan function is able to keep count of each word between the delimiters.  &lt;BR /&gt;
&lt;BR /&gt;
The irony is that I myself created the text string with the dash delimiters in the first place but now want to unravel it for further processing.&lt;BR /&gt;
&lt;BR /&gt;
Thanks a millon.</description>
      <pubDate>Thu, 16 Sep 2010 20:32:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Extract-ID-s-separated-by-dashes-from-text-string/m-p/72861#M15708</guid>
      <dc:creator>hkassam</dc:creator>
      <dc:date>2010-09-16T20:32:22Z</dc:date>
    </item>
    <item>
      <title>Re: Extract ID's separated by dashes from text string</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Extract-ID-s-separated-by-dashes-from-text-string/m-p/72862#M15709</link>
      <description>Hi:&lt;BR /&gt;
  There is an alternate solution -- but it requires the following to be true:&lt;BR /&gt;
1) there is always 1 space between VAR1 (ID), VAR2 and VAR3 and&lt;BR /&gt;
2) the last piece of ID -ALWAYS- ends in a number&lt;BR /&gt;
 &lt;BR /&gt;
  If these are true, then you can parse out ID when you first read the input file, as shown in the code below, using the ANYDIGIT function.&lt;BR /&gt;
 &lt;BR /&gt;
cynthia&lt;BR /&gt;
[pre]&lt;BR /&gt;
data readdash;&lt;BR /&gt;
length var1 $35 var2 8 var3 $32;&lt;BR /&gt;
   ** This infile/input combination assumes that there is ALWAYS;&lt;BR /&gt;
   ** at least 1 space between each variable in the input file.;&lt;BR /&gt;
   infile datalines dlm=' ' ;&lt;BR /&gt;
   input Var1 &amp;amp;$ Var2:mmddyy10. Var3 &amp;amp;$32.;&lt;BR /&gt;
   format var2 mmddyy10.;&lt;BR /&gt;
                   &lt;BR /&gt;
   ** As long as ID -always- ends in a number,;&lt;BR /&gt;
   ** search backwards from the end of VAR1 and find the last digit;&lt;BR /&gt;
   ** in the VAR1 variable. With ANYDIGIT, if start is negative, the;&lt;BR /&gt;
   ** search starts at the END of the string and goes to the right.;&lt;BR /&gt;
   ** So the value of GOTNUM is the position of the digit before the ----.;&lt;BR /&gt;
   lg = length(var1)*-1;&lt;BR /&gt;
   gotnum = anydigit(var1,lg);&lt;BR /&gt;
                                             &lt;BR /&gt;
   ** could obs 3 happen in the data??;&lt;BR /&gt;
   if gotnum gt 0 then id = substr(var1,1,gotnum);&lt;BR /&gt;
   else if gotnum = 0 then id = var1;&lt;BR /&gt;
return;&lt;BR /&gt;
datalines;&lt;BR /&gt;
AB6772-CD5885-ABCD6194-XY7199----  09/14/2010 London&lt;BR /&gt;
PQ5928-SP352-BZ5752-----  09/16/2010 New York&lt;BR /&gt;
ZZzzzz-SPeee-YYaaaa-----  09/16/2010 Albuquerque&lt;BR /&gt;
;;;;&lt;BR /&gt;
run;&lt;BR /&gt;
                     &lt;BR /&gt;
proc print data=readdash;&lt;BR /&gt;
run;&lt;BR /&gt;
[/pre]</description>
      <pubDate>Fri, 17 Sep 2010 18:37:03 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Extract-ID-s-separated-by-dashes-from-text-string/m-p/72862#M15709</guid>
      <dc:creator>Cynthia_sas</dc:creator>
      <dc:date>2010-09-17T18:37:03Z</dc:date>
    </item>
    <item>
      <title>Re: Extract ID's separated by dashes from text string</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Extract-ID-s-separated-by-dashes-from-text-string/m-p/72863#M15710</link>
      <description>pick up on Cynthia's solution and make the infile parsing work a little harder for you &lt;span class="lia-unicode-emoji" title=":winking_face:"&gt;😉&lt;/span&gt;&lt;BR /&gt;
The DLM= option of the INFILE statement can accept a variable name as well as a constant. Each time an INPUT statement is executed from the file defined by that INFILE, the value of the DLM= variable is re-evaluated to prepare for parsing. So, for reading the ID variables, set the delimiter to dash (-) and then set it to blank to read the remains of the line. Once the whole line is read, loop over the  ID variables to release an observation for each non-blank ID :[pre]data  reduced( keep= ID Var2 Var3 ) ;&lt;BR /&gt;
   length id i1-i8 $11 Var2 6 Var3 $40 delimiter $1 ;&lt;BR /&gt;
   attrib var2 format= mmddyy10. informat= mmddyy10. ;&lt;BR /&gt;
   infile cards dsd dlm=delimiter  column=c ;&lt;BR /&gt;
   delimiter = '-' ;&lt;BR /&gt;
   input i1-i8 @;&lt;BR /&gt;
   delimiter = ' ' ;&lt;BR /&gt;
   input +1 /* pass over the first blank*/&lt;BR /&gt;
         Var2 @ ;&lt;BR /&gt;
   Var3 = substr( _infile_, c ) ; *put rest of line into Var3 ;&lt;BR /&gt;
   array   i(8) ;&lt;BR /&gt;
   do c=1 to 8 ;&lt;BR /&gt;
      if   i(c) =' ' then continue ;&lt;BR /&gt;
      id = i(c) ;&lt;BR /&gt;
      output ;&lt;BR /&gt;
   end ;&lt;BR /&gt;
list;cards;&lt;BR /&gt;
AB6772-CD5885-ABCD6194-XY7199---- 09/14/2010 London&lt;BR /&gt;
PQ5928-SP352-BZ5752----- 09/16/2010 New York&lt;BR /&gt;
;[/pre]&lt;BR /&gt;
great things available in data step INFILE and INPUT parsing&lt;BR /&gt;
  &lt;BR /&gt;
enjoy&lt;BR /&gt;
peterC</description>
      <pubDate>Sun, 19 Sep 2010 16:51:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Extract-ID-s-separated-by-dashes-from-text-string/m-p/72863#M15710</guid>
      <dc:creator>Peter_C</dc:creator>
      <dc:date>2010-09-19T16:51:38Z</dc:date>
    </item>
  </channel>
</rss>

