<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Substring by multiple delimiters in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Substring-by-multiple-delimiters/m-p/583693#M166179</link>
    <description>&lt;P&gt;Provide more details about the output you want.&amp;nbsp; If you just want to locate the position of a string like '--', use the INDEX() function.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;var1='jkdfldfd-abc--123';
loc=index(var1,'--');
if loc then var2=substr(var1,1,loc-1);&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;116  data _null_;
117    length var1 var1 $100;
118    var1='jkdfldfd-abc--123';
119    loc=index(var1,'--');
120    if loc then var2=substr(var1,1,loc-1);
121    put (_all_) (=);
122  run;

var1=jkdfldfd-abc--123 loc=13 var2=jkdfldfd-abc
&lt;/PRE&gt;</description>
    <pubDate>Sat, 24 Aug 2019 17:23:56 GMT</pubDate>
    <dc:creator>Tom</dc:creator>
    <dc:date>2019-08-24T17:23:56Z</dc:date>
    <item>
      <title>Substring by multiple delimiters</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Substring-by-multiple-delimiters/m-p/583692#M166178</link>
      <description>&lt;P&gt;Hello, this is my first post in the community, and I'm hoping to get some help. I have a large SAS dataset with a character field that I'd like to clean up by using a set of 2 delimiters.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;For instance, my data has a string variable with observations like:&lt;/P&gt;&lt;P&gt;jkdfldfd-abc--123&lt;/P&gt;&lt;P&gt;dkjfds789-cdegfd--abc1&lt;/P&gt;&lt;P&gt;asddkj--dmco-hwrfd&lt;/P&gt;&lt;P&gt;....&lt;/P&gt;&lt;P&gt;....&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I am trying to find a way to create a new variable that cuts the string off at the double-dash delimiter '--' and not at a single dash '-'. I read somewhere that the SCAN function does not support multiple delimiters, and some have suggested alternatives using the INFILE statement. The INFILE approach is not efficient in my case, since the data is already in SAS format and, as I said, it's a large dataset with many fields.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Is there another alternative that you can suggest?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Many thanks in advance for your help!&lt;/P&gt;</description>
      <pubDate>Sat, 24 Aug 2019 17:13:56 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Substring-by-multiple-delimiters/m-p/583692#M166178</guid>
      <dc:creator>goyalrk</dc:creator>
      <dc:date>2019-08-24T17:13:56Z</dc:date>
    </item>
    <item>
      <title>Re: Substring by multiple delimiters</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Substring-by-multiple-delimiters/m-p/583693#M166179</link>
      <description>&lt;P&gt;Provide more details about the output you want.&amp;nbsp; If you just want to locate the position of a string like '--', use the INDEX() function.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;var1='jkdfldfd-abc--123';
loc=index(var1,'--');
if loc then var2=substr(var1,1,loc-1);&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;116  data _null_;
117    length var1 var1 $100;
118    var1='jkdfldfd-abc--123';
119    loc=index(var1,'--');
120    if loc then var2=substr(var1,1,loc-1);
121    put (_all_) (=);
122  run;

var1=jkdfldfd-abc--123 loc=13 var2=jkdfldfd-abc
&lt;/PRE&gt;</description>
      <pubDate>Sat, 24 Aug 2019 17:23:56 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Substring-by-multiple-delimiters/m-p/583693#M166179</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2019-08-24T17:23:56Z</dc:date>
    </item>
    <item>
      <title>Re: Substring by multiple delimiters</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Substring-by-multiple-delimiters/m-p/583696#M166182</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/286452"&gt;@goyalrk&lt;/a&gt;:&lt;/P&gt;
&lt;P&gt;There are many ways of doing this. One would be:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have ;                                     
  input str $char22. ;                          
  cards ;                                       
jkdfldfd-abc--123                               
dkjfds789-cdegfd--abc1                          
asddkj--dmco-hwrfd                              
run ;                                           
                                                
data want ;                                     
  set have ;                                    
  str1 = substr (str, 1, find (str, "--") - 1) ;
  str2 = substr (str, find (str, "--") + 2) ;   
run ;                                           
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Using SUBSTR means that STR1 and STR2 will assume the same system length as STR.&lt;/P&gt;
&lt;P&gt;Another is to translate "--" into a single character other than a dash and use the SCAN function:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want ;                                 
  set have ;                                
  str1 = str ;                              
  str2 = str ;                              
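  /* "|" is assumed not to occur in STR, so it is passed to SCAN as the only delimiter */
  str1 = scan (tranwrd (str, "--", "|"), 1, "|") ;
  str2 = scan (tranwrd (str, "--", "|"), 2, "|") ;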
run ;                                       
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Here STR1 and STR2 are first assigned STR to make them assume its system length; otherwise SCAN would make both $200 by default.&lt;/P&gt;
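&lt;P&gt;As a rough sketch of an alternative (not part of the code above), the lengths could also be declared explicitly with a LENGTH statement before SCAN is called. The $22 is only an assumption chosen to match the $char22. informat used for STR, and "|" is an assumed placeholder character that does not occur in the data:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want ;
  set have ;
  /* declare the lengths up front so SCAN does not decide them */
  length str1 str2 $ 22 ;
  /* "|" is an assumed placeholder that never occurs in STR */
  str1 = scan (tranwrd (str, "--", "|"), 1, "|") ;
  str2 = scan (tranwrd (str, "--", "|"), 2, "|") ;
run ;
&lt;/CODE&gt;&lt;/PRE&gt;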
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Kind regards&lt;/P&gt;
&lt;P&gt;Paul D.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 24 Aug 2019 17:46:49 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Substring-by-multiple-delimiters/m-p/583696#M166182</guid>
      <dc:creator>hashman</dc:creator>
      <dc:date>2019-08-24T17:46:49Z</dc:date>
    </item>
    <item>
      <title>Re: Substring by multiple delimiters</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Substring-by-multiple-delimiters/m-p/583703#M166186</link>
      <description>&lt;P&gt;Thanks for your response. My goal here is to extract all the characters up to the first run of two or more consecutive delimiters.&amp;nbsp;&lt;/P&gt;&lt;P&gt;So, for instance, from&amp;nbsp;"&lt;U&gt;adfrgk-dsfgdg--dfgfh&lt;/U&gt;" I want to extract "&lt;U&gt;&lt;EM&gt;adfrgk-dsfgdg&lt;/EM&gt;&lt;/U&gt;"; from "&lt;U&gt;oasdjfiowd-adhf-kladhf----zdfdsfg&lt;/U&gt;" extract "&lt;U&gt;oasdjfiowd-adhf-kladhf&lt;/U&gt;" into a separate variable.&lt;/P&gt;</description>
      <pubDate>Sat, 24 Aug 2019 20:24:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Substring-by-multiple-delimiters/m-p/583703#M166186</guid>
      <dc:creator>goyalrk</dc:creator>
      <dc:date>2019-08-24T20:24:05Z</dc:date>
    </item>
    <item>
      <title>Re: Substring by multiple delimiters</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Substring-by-multiple-delimiters/m-p/583704#M166187</link>
      <description>&lt;P&gt;Thanks&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/21262"&gt;@hashman&lt;/a&gt;!&amp;nbsp;For some reason, both of your alternatives return an error: "invalid third argument". Am I missing something?&lt;/P&gt;</description>
      <pubDate>Sat, 24 Aug 2019 20:26:06 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Substring-by-multiple-delimiters/m-p/583704#M166187</guid>
      <dc:creator>goyalrk</dc:creator>
      <dc:date>2019-08-24T20:26:06Z</dc:date>
    </item>
    <item>
      <title>Re: Substring by multiple delimiters</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Substring-by-multiple-delimiters/m-p/583705#M166188</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/286452"&gt;@goyalrk&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;Thanks for your response. My goal here is to extract all the characters up to the first run of two or more consecutive delimiters.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;So, for instance, from&amp;nbsp;"&lt;U&gt;adfrgk-dsfgdg--dfgfh&lt;/U&gt;" I want to extract "&lt;U&gt;&lt;EM&gt;adfrgk-dsfgdg&lt;/EM&gt;&lt;/U&gt;"; from "&lt;U&gt;oasdjfiowd-adhf-kladhf----zdfdsfg&lt;/U&gt;" extract "&lt;U&gt;oasdjfiowd-adhf-kladhf&lt;/U&gt;" into a separate variable.&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;So that is what I posted.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;What do you want to happen when the source string does NOT contain the delimiter string?&lt;/P&gt;
&lt;P&gt;The code I posted does nothing, but you could add an ELSE clause to implement whatever rule you want in that case.&lt;/P&gt;
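&lt;P&gt;A minimal sketch of such an ELSE clause, assuming the rule is to keep the whole string when no '--' is found (both that rule and the test value are assumptions for illustration):&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
  length var1 var2 $100;
  var1='xyzz-abcd-456';
  loc=index(var1,'--');
  /* '--' found: keep everything before it */
  if loc then var2=substr(var1,1,loc-1);
  /* assumed rule: '--' not found, keep the whole string */
  else var2=var1;
  put var1= loc= var2=;
run;&lt;/CODE&gt;&lt;/PRE&gt;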
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you want the whole string then you could avoid the extra variable and IF/THEN logic by appending '--' so that you are sure you will always find at least one delimiter string.&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;newvar=substr(oldvar||'--',1,index(oldvar||'--','--')-1);&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Sat, 24 Aug 2019 20:31:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Substring-by-multiple-delimiters/m-p/583705#M166188</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2019-08-24T20:31:58Z</dc:date>
    </item>
    <item>
      <title>Re: Substring by multiple delimiters</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Substring-by-multiple-delimiters/m-p/583719#M166191</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/286452"&gt;@goyalrk&lt;/a&gt;:&lt;/P&gt;
&lt;P&gt;The only reason I can see is that in your testing, you have a string with no more than 1 dash in a row anywhere. In this case, the expression:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="courier new,courier" color="#0000FF"&gt;find (str, "--")&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;will return 0, so the third SUBSTR argument becomes -1, which is invalid; hence the error. One way to guard against it, and return the full original string if it has no double dashes anywhere, is to use this expression as the third SUBSTR argument:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="courier new,courier" color="#0000FF"&gt;ifn (find (str, "--"), find (str, "--") - 1, length (str))&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Another way is to make &lt;FONT face="courier new,courier" color="#0000FF"&gt;"--" &lt;/FONT&gt;what is termed a sentinel by attaching it to the end of STR:&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="courier new,courier" color="#0000FF"&gt;substr (str||"--", 1, find (str||"--", "--") - 1)&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;as it guarantees that &lt;FONT face="courier new,courier" color="#0000FF"&gt;"--"&lt;/FONT&gt; is always found. Put together, with a string without double dashes added as the last test record:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have ;                                                                           
  input str $char23. ;
  cards ;                                                                             
jkdfldfd-abc----123                                                                   
dkjfds789-cdegfd---abc1                                                               
asddkj--dmco-hwrfd                                                                    
xyzz-abcd-456                                                                         
run ;                                                                                 
                                                                                      
data want ;                                                                           
  set have ;                                                                          
  str1 = substr (str, 1, ifn (find (str, "--"), find (str, "--") - 1, length (str))) ;
* str1 = substr (str||"--", 1, find (str||"--", "--") - 1) ; *same results;            
run ;                                                                                 
&lt;/CODE&gt;&lt;/PRE&gt;
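&lt;P&gt;One edge case the step above does not cover is a string that begins with "--": FIND then returns 1, the requested length becomes 0, and SUBSTR flags a length of 0 as invalid as well. A hedged sketch, assuming a blank result is acceptable in that case, swaps in SUBSTRN, which accepts a zero length:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want ;
  set have ;
  /* the appended "--" is the sentinel, so FIND never returns 0 */
  /* SUBSTRN (swapped in for SUBSTR) accepts a length of 0 and yields a blank */
  str1 = substrn (str||"--", 1, find (str||"--", "--") - 1) ;
run ;
&lt;/CODE&gt;&lt;/PRE&gt;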
&lt;P&gt;Kind regards&lt;/P&gt;
&lt;P&gt;Paul D.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 24 Aug 2019 23:37:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Substring-by-multiple-delimiters/m-p/583719#M166191</guid>
      <dc:creator>hashman</dc:creator>
      <dc:date>2019-08-24T23:37:38Z</dc:date>
    </item>
    <item>
      <title>Re: Substring by multiple delimiters</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Substring-by-multiple-delimiters/m-p/583981#M166284</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/286452"&gt;@goyalrk&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;Thanks&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/21262"&gt;@hashman&lt;/a&gt;!&amp;nbsp;For some reason, both of your alternatives return an error: "invalid third argument". Am I missing something?&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Post the code and the messages from the log. Paste them into a code box opened using the forum's {I} or "running man" icon.&lt;/P&gt;
&lt;P&gt;Without the log we can't see the code you actually submitted, and the messages often include diagnostics that you didn't mention.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 26 Aug 2019 16:20:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Substring-by-multiple-delimiters/m-p/583981#M166284</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2019-08-26T16:20:13Z</dc:date>
    </item>
  </channel>
</rss>

