<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: punctuation cleaning. in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/punctuation-cleaning/m-p/545032#M150742</link>
    <description>The second link I sent has that kind of information. These are Perl regular expressions.</description>
    <pubDate>Thu, 21 Mar 2019 20:40:14 GMT</pubDate>
    <dc:creator>Reeza</dc:creator>
    <dc:date>2019-03-21T20:40:14Z</dc:date>
    <item>
      <title>punctuation cleaning.</title>
      <link>https://communities.sas.com/t5/SAS-Programming/punctuation-cleaning/m-p/544731#M150654</link>
      <description>&lt;P&gt;Dear all,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I need to do some punctuation cleaning.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;1. Names beginning or ending with a double quotation mark (i.e., """ %" or "% """) should not contain a space after the beginning quotation mark or before the ending quotation mark, respectively.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;2. Names that have a double quotation mark at the beginning and the end, and that do not contain any other double quotation mark (i.e.,&lt;/P&gt;&lt;DIV class="page"&gt;&lt;DIV class="layoutArea"&gt;&lt;DIV class="column"&gt;&lt;P&gt;&lt;SPAN&gt;“””%””” and not “””%””%”””)&lt;/SPAN&gt;, should have the quotation marks removed.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;3. Non-alphanumerical characters (i.e., characters except A-Z; 0-9; “””; “@”; “(“; “’”; “#”; “!”; “*”; “/”) at the beginning of a name that are not relevant should be removed.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;4. Non-alphanumerical characters (i.e., characters except A-Z; 0-9; “””; “@”; “(“; “’”; “#”; “!”; “*”; “/”) at the end of a name that are not relevant should be removed.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Could you please give me some suggestions?&lt;/P&gt;&lt;P&gt;Thanks in advance.&lt;/P&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Wed, 20 Mar 2019 23:22:47 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/punctuation-cleaning/m-p/544731#M150654</guid>
      <dc:creator>Alexxxxxxx</dc:creator>
      <dc:date>2019-03-20T23:22:47Z</dc:date>
    </item>
    <item>
      <title>Re: punctuation cleaning.</title>
      <link>https://communities.sas.com/t5/SAS-Programming/punctuation-cleaning/m-p/544750#M150661</link>
      <description>&lt;UL&gt;
&lt;LI&gt;CHAR() to get specific characters to check for space or quotation marks&lt;/LI&gt;
&lt;LI&gt;SCAN() to get information between quotes&lt;/LI&gt;
&lt;LI&gt;NOTDIGIT(),&amp;nbsp;NOTALPHA(),&amp;nbsp;NOTALNUM() to find the position of the first character that is not a digit, not a letter, or not alphanumeric, respectively&lt;/LI&gt;
&lt;/UL&gt;
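&lt;P&gt;A minimal sketch of how these functions might be combined, not a full solution to the four rules. The variable names (NAME, CH1, CH2, INNER, POS) and the sample value are made up for illustration, and NOTALNUM() is the function used here for the alphanumeric check.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
  length NAME $20;
  NAME = '" ACME LTD "';            /* hypothetical sample value */
  /* CHAR() returns the single character at a given position */
  CH1 = char(NAME, 1);
  CH2 = char(NAME, 2);
  /* SCAN() with the double quote as the only delimiter returns the text between the quotes */
  INNER = strip(scan(NAME, 1, '"'));
  /* NOTALNUM() returns the position of the first character that is neither a letter nor a digit, or 0 if there is none */
  POS = notalnum(trim(INNER));
  put CH1= CH2= INNER= POS=;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Checking CH1 and CH2 is one way to detect a quote followed by a space (rule 1). The remaining rules would need similar position checks at the end of the string.&lt;/P&gt;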
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;A href="https://go.documentation.sas.com/?cdcId=pgmsascdc&amp;amp;cdcVersion=9.4_3.4&amp;amp;docsetId=lefunctionsref&amp;amp;docsetTarget=n01f5qrjoh9h4hn1olbdpb5pr2td.htm&amp;amp;locale=en" target="_blank"&gt;https://go.documentation.sas.com/?cdcId=pgmsascdc&amp;amp;cdcVersion=9.4_3.4&amp;amp;docsetId=lefunctionsref&amp;amp;docsetTarget=n01f5qrjoh9h4hn1olbdpb5pr2td.htm&amp;amp;locale=en&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you're feeling adventurous, learning Perl regular expressions is likely your best overall solution for manipulating text data.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You can put some of your test data here and test your regular expressions. I find this helpful for building mine.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;A href="https://regexr.com" target="_blank"&gt;https://regexr.com&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 21 Mar 2019 01:20:54 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/punctuation-cleaning/m-p/544750#M150661</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2019-03-21T01:20:54Z</dc:date>
    </item>
    <item>
      <title>Re: punctuation cleaning.</title>
      <link>https://communities.sas.com/t5/SAS-Programming/punctuation-cleaning/m-p/544755#M150664</link>
      <description>&lt;P&gt;or regular expressions.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data HAVE;
  STR='""  a "n "" '; output;
  STR='"  a  n  "  '; output;
  STR='""  a  n "" '; output;
  STR='~~1~~       '; output;
  STR='#~1~#       '; output;
run;
data WANT;
  set HAVE;
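  /* Rule 2: a name wrapped in doubled quotes with no quote inside keeps a single quote on each side */
  STR=prxchange('s/\A""([^"]*?)""\Z/"$1"/',-1,trim(STR));
  /* Rule 1: drop any spaces right after a leading quote */
  STR=prxchange('s/\A"( *)/"/',-1,trim(STR));
  /* Rule 1: drop any spaces right before a trailing quote */
  STR=prxchange('s/( *)"\Z/"/',-1,trim(STR));
  /* Rules 3 and 4: strip leading and trailing characters that are not in the allowed set */
  STR=prxchange('s/\A[^a-zA-Z0-9"\@()#!* ]*(.*?)[^a-zA-Z0-9"\@()#!* ]*\Z/$1/',-1,trim(STR));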
  put STR=;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&lt;FONT face="courier new,courier"&gt;STR="" a "n ""&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;STR="a n"&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;STR="a n"&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;STR=1&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;STR=#~1~#&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 21 Mar 2019 01:30:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/punctuation-cleaning/m-p/544755#M150664</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2019-03-21T01:30:13Z</dc:date>
    </item>
    <item>
      <title>Re: punctuation cleaning.</title>
      <link>https://communities.sas.com/t5/SAS-Programming/punctuation-cleaning/m-p/544927#M150712</link>
      <description>Dear Reeza,&lt;BR /&gt;&lt;BR /&gt;Thanks for sharing.&lt;BR /&gt;&lt;BR /&gt;Do you know where I can learn the meaning of strings like 's/(\w+), (\w+)/$2 $1', which are passed to prxchange()?</description>
      <pubDate>Thu, 21 Mar 2019 16:18:08 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/punctuation-cleaning/m-p/544927#M150712</guid>
      <dc:creator>Alexxxxxxx</dc:creator>
      <dc:date>2019-03-21T16:18:08Z</dc:date>
    </item>
    <item>
      <title>Re: punctuation cleaning.</title>
      <link>https://communities.sas.com/t5/SAS-Programming/punctuation-cleaning/m-p/544935#M150716</link>
      <description>Dear @ChrisNZ,&lt;BR /&gt;&lt;BR /&gt;Thanks for sharing. Could you please explain the meaning of the code in prxchange()?&lt;BR /&gt;Besides, could you please let me know where I can learn this kind of code?</description>
      <pubDate>Thu, 21 Mar 2019 16:33:19 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/punctuation-cleaning/m-p/544935#M150716</guid>
      <dc:creator>Alexxxxxxx</dc:creator>
      <dc:date>2019-03-21T16:33:19Z</dc:date>
    </item>
    <item>
      <title>Re: punctuation cleaning.</title>
      <link>https://communities.sas.com/t5/SAS-Programming/punctuation-cleaning/m-p/545032#M150742</link>
      <description>The second link I sent has that kind of information. These are Perl regular expressions.</description>
      <pubDate>Thu, 21 Mar 2019 20:40:14 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/punctuation-cleaning/m-p/545032#M150742</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2019-03-21T20:40:14Z</dc:date>
    </item>
    <item>
      <title>Re: punctuation cleaning.</title>
      <link>https://communities.sas.com/t5/SAS-Programming/punctuation-cleaning/m-p/545045#M150746</link>
      <description>&lt;P&gt;There are heaps of tutorials online for Perl regular expressions.&lt;/P&gt;
&lt;P&gt;And validation sites too, like this one, which I use:&amp;nbsp;&lt;A href="https://regex101.com/" target="_blank"&gt;https://regex101.com/&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;It will explain how your expression is parsing your text.&lt;/P&gt;
&lt;P&gt;My book has a chapter with a dictionary of all SAS-supported expressions (with short examples but no tutorial).&lt;/P&gt;
&lt;P&gt;For example, my first expression (I removed the ? as it's not needed):&lt;/P&gt;
&lt;PRE class=" language-sas"&gt;&lt;CODE class="  language-sas"&gt;&lt;SPAN class="token comment"&gt;\A""([^"]*)""\Z&lt;/SPAN&gt;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;reads as:&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="courier new,courier"&gt;\A&amp;nbsp;&lt;/FONT&gt; start of string&lt;/P&gt;
&lt;P&gt;&lt;FONT face="courier new,courier"&gt;""&amp;nbsp;&amp;nbsp;&lt;/FONT&gt;&amp;nbsp; 2 quotes&lt;/P&gt;
&lt;P&gt;&lt;FONT face="courier new,courier"&gt;()&amp;nbsp;&amp;nbsp;&lt;/FONT&gt;&amp;nbsp; a group&lt;/P&gt;
&lt;P&gt;&lt;FONT face="courier new,courier"&gt;[^"]&amp;nbsp;&lt;/FONT&gt;anything but double quotes&lt;/P&gt;
&lt;P&gt;&lt;FONT face="courier new,courier"&gt;*&amp;nbsp; &amp;nbsp;&lt;/FONT&gt;&amp;nbsp; match as many as possible&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp; so all 3 together mean:&lt;/P&gt;
&lt;P&gt;&lt;CODE class="  language-sas"&gt;&lt;SPAN class="token comment"&gt;([^"]*)&lt;/SPAN&gt;&lt;/CODE&gt; capture all the non-quote characters you can and put them in a group&lt;/P&gt;
&lt;P&gt;&lt;FONT face="courier new,courier"&gt;""&amp;nbsp;&amp;nbsp;&lt;/FONT&gt;&amp;nbsp; 2 quotes&lt;/P&gt;
&lt;P&gt;&lt;FONT face="courier new,courier"&gt;\Z&amp;nbsp;&amp;nbsp;&lt;/FONT&gt;&amp;nbsp;end of string&lt;/P&gt;
&lt;P&gt;So the whole expression reads: match a string that starts with 2 quotes, then contains no quotes, then ends with 2 quotes.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The second part&amp;nbsp;&lt;/P&gt;
&lt;PRE class=" language-sas"&gt;&lt;CODE class="  language-sas"&gt;&lt;SPAN class="token comment"&gt;"$1"&lt;/SPAN&gt;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;replaces everything matched with: a quote then the capture group then a quote.&lt;/P&gt;
&lt;P&gt;I could have written the expression&lt;/P&gt;
&lt;P&gt;&lt;FONT face="courier new,courier"&gt;STR=prxchange('s/\A"("[^"]*")"\Z/$1/',-1,trim(STR));&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;to the same effect&lt;/P&gt;
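&lt;P&gt;A small sketch to check that claim, assuming a made-up test value; the variable names A, B and SAME are only for illustration. Both calls should turn the doubled quotes at each end into a single quote.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
  STR = '""a n""';                  /* hypothetical test value: doubled quotes at both ends */
  /* Form 1: match the outer pairs of quotes and write one quote back on each side */
  A = prxchange('s/\A""([^"]*)""\Z/"$1"/', -1, trim(STR));
  /* Form 2: capture the inner quotes together with the text and drop the outer quotes */
  B = prxchange('s/\A"("[^"]*")"\Z/$1/', -1, trim(STR));
  /* SAME is 1 when both forms return the same string */
  SAME = (A = B);
  put A= B= SAME=;
run;&lt;/CODE&gt;&lt;/PRE&gt;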
&lt;P&gt;The very first letter can be &lt;STRONG&gt;m&lt;/STRONG&gt; (often omitted) for matching or &lt;STRONG&gt;s&lt;/STRONG&gt; for substituting.&lt;/P&gt;
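&lt;P&gt;As a rough illustration of the difference, here is a sketch that uses the same pattern once for matching and once for substituting. The sample value and the variable names NAME, POS and SWAPPED are assumptions for the example.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
  NAME = 'Jones, Fred';             /* hypothetical sample value */
  /* Matching (the leading m is omitted): PRXMATCH returns the position where the match starts, or 0 */
  POS = prxmatch('/(\w+), (\w+)/', NAME);
  /* Substituting: $1 and $2 are the first and second capture groups, so the two words are swapped */
  SWAPPED = prxchange('s/(\w+), (\w+)/$2 $1/', -1, NAME);
  put POS= SWAPPED=;                /* POS=1 SWAPPED=Fred Jones */
run;&lt;/CODE&gt;&lt;/PRE&gt;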
&lt;P&gt;It's a steep learning curve, but I taught myself so you can too. &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 21 Mar 2019 20:58:20 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/punctuation-cleaning/m-p/545045#M150746</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2019-03-21T20:58:20Z</dc:date>
    </item>
  </channel>
</rss>

