<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: prxparse and prxmatch when part of string is conditional on the value of a separate variable in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/prxparse-and-prxmatch-when-part-of-string-is-conditional-on-the/m-p/645438#M192941</link>
    <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/37814"&gt;@Walternate&lt;/a&gt;,&lt;/P&gt;
&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/37814"&gt;@Walternate&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;1. Can I incorporate a numeric range spanning more than one character? For example, if the range was 399-401, could I do [399-401], or is that not allowed? I tried using test data and it didn't give an error but was not able to find matches, but maybe I'm missing something about the right way to set it up.&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;No, you cannot specify numeric ranges like this. Perl regular expressions operate on &lt;EM&gt;characters&lt;/EM&gt;: a bracket expression such as [2-7] matches exactly one character, and its range is defined by the underlying character codes (ASCII or, depending on the platform, EBCDIC). For example, [Y-b] is a valid range on an ASCII system and includes (in addition to 'Y', 'Z', 'a' and 'b') the six special characters between 'Z' and 'a' in the ASCII collating sequence, e.g., the underscore. However, [b-Y] would be invalid since 'b'&amp;gt;'Y'. The regex [314-618] would be interpreted as "'3'&amp;nbsp;&lt;EM&gt;or&lt;/EM&gt;&amp;nbsp;'1'&amp;nbsp;&lt;EM&gt;or&lt;/EM&gt; something in the (character!) range '4'-'6' (i.e., '4', '5' or '6') &lt;EM&gt;or&lt;/EM&gt;&amp;nbsp;'1' [redundant] &lt;EM&gt;or&lt;/EM&gt;&amp;nbsp;'8'." Your example [399-401] contains an invalid range (9-4), so I'm surprised to read that you didn't get error messages including "&lt;FONT face="courier new,courier"&gt;ERROR: Invalid [] range "9-4"&lt;/FONT&gt; ..." In most cases it would be cumbersome to construct a regex matching a range of integers.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/37814"&gt;@Walternate&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;2. Adding even more complexity: What if I have two numeric ranges based on the values of a separate variable?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I can think of the kluge-y way to do this:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;if var1 = 'a' then do;&lt;/P&gt;
&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;/*matching sequence with first range*/&lt;/P&gt;
&lt;P&gt;end;&lt;/P&gt;
&lt;P&gt;else if var1 = 'b' then do;&lt;/P&gt;
&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;/*matching sequence with second range*/&lt;/P&gt;
&lt;P&gt;end;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Is there a better way, or is that the best I can do?&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;P&gt;As mentioned above, "numeric ranges" are difficult to implement correctly. I would rather extract the "numeric range" part of the string (e.g., with PRXPOSN or, in simple cases, with SUBSTR), convert it to an integer (INPUT function) and perform a numeric comparison like &lt;FONT face="courier new,courier"&gt;399&amp;lt;=n&amp;lt;=401&lt;/FONT&gt;&amp;nbsp;as part of the matching process. Then it's easy to include a second range: &lt;EM&gt;PRXMATCH criterion&lt;/EM&gt; &amp;amp; (var1='a' &amp;amp; &lt;EM&gt;first range check&lt;/EM&gt; |&amp;nbsp;var1='b' &amp;amp; &lt;EM&gt;second range check&lt;/EM&gt;). Here, the "&lt;EM&gt;PRXMATCH criterion&lt;/EM&gt;" would address the other parts of the string, excluding the "numeric range."&lt;/P&gt;</description>
    <pubDate>Tue, 05 May 2020 22:27:40 GMT</pubDate>
    <dc:creator>FreelanceReinh</dc:creator>
    <dc:date>2020-05-05T22:27:40Z</dc:date>
    <item>
      <title>prxparse and prxmatch when part of string is conditional on the value of a separate variable</title>
      <link>https://communities.sas.com/t5/SAS-Programming/prxparse-and-prxmatch-when-part-of-string-is-conditional-on-the/m-p/645391#M192920</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'm using the prxparse and prxmatch functions to check whether the structure of a 20-character string is valid.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Here is a simplified example:&lt;/P&gt;&lt;P&gt;&amp;nbsp;checkit= prxparse("/[A-C][Z][3][1-4][3-5]\d{3}[5][2-7]/i");&lt;BR /&gt;match = (prxmatch(checkit, myvar)=1);&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;This works fine, but I have two questions:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;1. Can I incorporate a numeric range spanning more than one character? For example, if the range was 399-401, could I do [399-401], or is that not allowed? I tried using test data and it didn't give an error but was not able to find matches, but maybe I'm missing something about the right way to set it up.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;2. Adding even more complexity: What if I have two numeric ranges based on the values of a separate variable?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I can think of the kluge-y way to do this:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;if var1 = 'a' then do;&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;/*matching sequence with first range*/&lt;/P&gt;&lt;P&gt;end;&lt;/P&gt;&lt;P&gt;else if var1 = 'b' then do;&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;/*matching sequence with second range*/&lt;/P&gt;&lt;P&gt;end;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Is there a better way, or is that the best I can do?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks!&lt;/P&gt;</description>
      <pubDate>Tue, 05 May 2020 19:48:51 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/prxparse-and-prxmatch-when-part-of-string-is-conditional-on-the/m-p/645391#M192920</guid>
      <dc:creator>Walternate</dc:creator>
      <dc:date>2020-05-05T19:48:51Z</dc:date>
    </item>
    <item>
      <title>Re: prxparse and prxmatch when part of string is conditional on the value of a separate variable</title>
      <link>https://communities.sas.com/t5/SAS-Programming/prxparse-and-prxmatch-when-part-of-string-is-conditional-on-the/m-p/645438#M192941</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/37814"&gt;@Walternate&lt;/a&gt;,&lt;/P&gt;
&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/37814"&gt;@Walternate&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;1. Can I incorporate a numeric range spanning more than one character? For example, if the range was 399-401, could I do [399-401], or is that not allowed? I tried using test data and it didn't give an error but was not able to find matches, but maybe I'm missing something about the right way to set it up.&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;No, you cannot specify numeric ranges like this. Perl regular expressions operate on &lt;EM&gt;characters&lt;/EM&gt;: a bracket expression such as [2-7] matches exactly one character, and its range is defined by the underlying character codes (ASCII or, depending on the platform, EBCDIC). For example, [Y-b] is a valid range on an ASCII system and includes (in addition to 'Y', 'Z', 'a' and 'b') the six special characters between 'Z' and 'a' in the ASCII collating sequence, e.g., the underscore. However, [b-Y] would be invalid since 'b'&amp;gt;'Y'. The regex [314-618] would be interpreted as "'3'&amp;nbsp;&lt;EM&gt;or&lt;/EM&gt;&amp;nbsp;'1'&amp;nbsp;&lt;EM&gt;or&lt;/EM&gt; something in the (character!) range '4'-'6' (i.e., '4', '5' or '6') &lt;EM&gt;or&lt;/EM&gt;&amp;nbsp;'1' [redundant] &lt;EM&gt;or&lt;/EM&gt;&amp;nbsp;'8'." Your example [399-401] contains an invalid range (9-4), so I'm surprised to read that you didn't get error messages including "&lt;FONT face="courier new,courier"&gt;ERROR: Invalid [] range "9-4"&lt;/FONT&gt; ..." In most cases it would be cumbersome to construct a regex matching a range of integers.&lt;/P&gt;
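&lt;P&gt;To see the one-character behaviour in practice, here is a minimal sketch. The data set name CLASSDEMO and the variables S, IN_CLASS and IN_ALT are made up for illustration. It tests a few values against [314-618] and, for comparison, against an explicit alternation that spells out the three integers 399, 400 and 401:&lt;/P&gt;

```sas
/* Sketch only: CLASSDEMO, S, IN_CLASS and IN_ALT are hypothetical names. */
data classdemo;
  length s $3;
  input s;

  /* A bracket expression consumes exactly one character:       */
  /* [314-618] means '3', '1', '4'..'6', '1' (again) or '8'.    */
  rx_class = prxparse('/^[314-618]$/');

  /* A small integer range can be listed value by value instead. */
  rx_alt = prxparse('/^(399|400|401)$/');

  /* STRIP removes trailing blanks so that the $ anchor can match. */
  in_class = (prxmatch(rx_class, strip(s)) > 0);
  in_alt   = (prxmatch(rx_alt,   strip(s)) > 0);

  drop rx_:;
  datalines;
5
8
400
402
;
run;

proc print data=classdemo noobs;
run;
```

&lt;P&gt;Because both patterns are anchored, the single characters '5' and '8' should satisfy only the bracket expression, '400' only the alternation, and '402' neither of them.&lt;/P&gt;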
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/37814"&gt;@Walternate&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;2. Adding even more complexity: What if I have two numeric ranges based on the values of a separate variable?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I can think of the kluge-y way to do this:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;if var1 = 'a' then do;&lt;/P&gt;
&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;/*matching sequence with first range*/&lt;/P&gt;
&lt;P&gt;end;&lt;/P&gt;
&lt;P&gt;else if var1 = 'b' then do;&lt;/P&gt;
&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;/*matching sequence with second range*/&lt;/P&gt;
&lt;P&gt;end;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Is there a better way, or is that the best I can do?&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
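&lt;P&gt;For the two-range case quoted above, one way to avoid a separate regex per branch is to let a single pattern validate the fixed positions and capture the three digits, then range-check the captured value numerically. The following is only a sketch under assumptions: the data set name CHECKED, the sample rows and the second range (500 to 510) are invented for illustration, and the pattern is the simplified one from the original question with a capture group added.&lt;/P&gt;

```sas
/* Sketch only: CHECKED, the sample rows and the 500-510 range are assumptions. */
data checked;
  length var1 $1 myvar $20;
  input var1 myvar;

  /* The regex checks the fixed positions; group 1 captures the three digits. */
  rx = prxparse('/^[A-C]Z3[1-4][3-5](\d{3})5[2-7]/i');

  match = 0;
  if prxmatch(rx, myvar) then do;
    /* Convert the captured digits to a number for an ordinary comparison. */
    n = input(prxposn(rx, 1, myvar), 3.);
    /* LE/AND/OR are the mnemonic forms of the comparison and logical operators. */
    match = (var1 = 'a' and 399 le n le 401)
         or (var1 = 'b' and 500 le n le 510);
  end;

  drop rx;
  datalines;
a AZ31340054
a AZ31340254
b AZ31350554
b AZ31340054
;
run;

proc print data=checked noobs;
run;
```

&lt;P&gt;With these sample rows, the first and third observations should end up with match=1 (400 falls in the 'a' range, 505 in the 'b' range), while the second and fourth get match=0.&lt;/P&gt;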
&lt;P&gt;As mentioned above, "numeric ranges" are difficult to implement correctly. I would rather extract the "numeric range" part of the string (e.g., with PRXPOSN or, in simple cases, with SUBSTR), convert it to an integer (INPUT function) and perform a numeric comparison like &lt;FONT face="courier new,courier"&gt;399&amp;lt;=n&amp;lt;=401&lt;/FONT&gt;&amp;nbsp;as part of the matching process. Then it's easy to include a second range: &lt;EM&gt;PRXMATCH criterion&lt;/EM&gt; &amp;amp; (var1='a' &amp;amp; &lt;EM&gt;first range check&lt;/EM&gt; |&amp;nbsp;var1='b' &amp;amp; &lt;EM&gt;second range check&lt;/EM&gt;). Here, the "&lt;EM&gt;PRXMATCH criterion&lt;/EM&gt;" would address the other parts of the string, excluding the "numeric range."&lt;/P&gt;</description>
      <pubDate>Tue, 05 May 2020 22:27:40 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/prxparse-and-prxmatch-when-part-of-string-is-conditional-on-the/m-p/645438#M192941</guid>
      <dc:creator>FreelanceReinh</dc:creator>
      <dc:date>2020-05-05T22:27:40Z</dc:date>
    </item>
    <item>
      <title>Re: prxparse and prxmatch when part of string is conditional on the value of a separate variable</title>
      <link>https://communities.sas.com/t5/SAS-Programming/prxparse-and-prxmatch-when-part-of-string-is-conditional-on-the/m-p/645475#M192968</link>
      <description>&lt;P&gt;Nothing to add to&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/32733"&gt;@FreelanceReinh&lt;/a&gt;'s reply.&lt;/P&gt;
&lt;P&gt;Note that your expression could be slightly simpler:&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;&amp;nbsp; prxparse("/[A-C]Z3[1-4][3-5]\d{3}5[2-7]/i")&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;For a small range like 399-401 you could search for (399|400|401)&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 05 May 2020 23:35:48 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/prxparse-and-prxmatch-when-part-of-string-is-conditional-on-the/m-p/645475#M192968</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2020-05-05T23:35:48Z</dc:date>
    </item>
    <item>
      <title>Re: prxparse and prxmatch when part of string is conditional on the value of a separate variable</title>
      <link>https://communities.sas.com/t5/SAS-Programming/prxparse-and-prxmatch-when-part-of-string-is-conditional-on-the/m-p/645611#M193019</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/37814"&gt;@Walternate&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;2. Adding even more complexity: What if I have two numeric ranges based on the values of a separate variable?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Sounds like a job for IF/Then:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If var=&amp;lt;whatever value&amp;gt; (or If var in (&amp;lt;list of values&amp;gt;)) then &amp;lt;the code for one search&amp;gt;;&lt;/P&gt;
&lt;P&gt;Else if var in (&amp;lt;other list of values&amp;gt;) then &amp;lt;the code for the other search&amp;gt;;&lt;/P&gt;</description>
      <pubDate>Wed, 06 May 2020 14:29:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/prxparse-and-prxmatch-when-part-of-string-is-conditional-on-the/m-p/645611#M193019</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2020-05-06T14:29:05Z</dc:date>
    </item>
  </channel>
</rss>

