<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: cleaning character columns in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/cleaning-character-columns/m-p/387240#M92825</link>
    <description>&lt;P&gt;Write down every rule you want to apply to the variable Addresses, then start coding.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Maybe deleting the unwanted content is easier than extracting the required information, although the last line of your example adds complexity to that approach.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Regular expressions seem to be the best way to extract the street names.&lt;/P&gt;</description>
    <pubDate>Fri, 11 Aug 2017 07:03:18 GMT</pubDate>
    <dc:creator>andreas_lds</dc:creator>
    <dc:date>2017-08-11T07:03:18Z</dc:date>
    <item>
      <title>cleaning character columns</title>
      <link>https://communities.sas.com/t5/SAS-Programming/cleaning-character-columns/m-p/387231#M92822</link>
      <description>&lt;P&gt;Hi everyone,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have a long list of addresses and I need to extract just the street name:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Dummy data set:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;U&gt;&lt;STRONG&gt;Addresses (column name)&lt;/STRONG&gt;&lt;/U&gt;&lt;/P&gt;&lt;P&gt;1000 Ngapenga rd&lt;/P&gt;&lt;P&gt;25 Gill Lane&lt;/P&gt;&lt;P&gt;po box 234&lt;/P&gt;&lt;P&gt;174/H Mangatin drive&lt;/P&gt;&lt;P&gt;102b te hono st&lt;/P&gt;&lt;P&gt;Te pahu rd&lt;/P&gt;&lt;P&gt;162 No 2 rd&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I want the extracted street names to look like this:&lt;/P&gt;&lt;P&gt;&lt;U&gt;&lt;STRONG&gt;Street_name:&lt;/STRONG&gt;&lt;/U&gt;&lt;/P&gt;&lt;P&gt;Ngapenga&lt;/P&gt;&lt;P&gt;Gill&lt;/P&gt;&lt;P&gt;Mangatin&lt;/P&gt;&lt;P&gt;Te Hono&lt;/P&gt;&lt;P&gt;Tepahu&lt;/P&gt;&lt;P&gt;No 2&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;My current code is below:&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;set customer_addy;&lt;BR /&gt;x = anydigit(addresses,1);&lt;BR /&gt;if x = 1 then street_name = substr(addresses,2,length(scan(addresses,2, ' ')));&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I cannot get my head around how to take all of these conditions into account. Any help is appreciated.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Fri, 11 Aug 2017 06:06:19 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/cleaning-character-columns/m-p/387231#M92822</guid>
      <dc:creator>Scott86</dc:creator>
      <dc:date>2017-08-11T06:06:19Z</dc:date>
    </item>
    <item>
      <title>Re: cleaning character columns</title>
      <link>https://communities.sas.com/t5/SAS-Programming/cleaning-character-columns/m-p/387238#M92823</link>
      <description>&lt;P&gt;You could try using &lt;STRONG&gt;translate&lt;/STRONG&gt; to replace digits with spaces, and&lt;/P&gt;
&lt;P&gt;use &lt;STRONG&gt;tranwrd&lt;/STRONG&gt; to replace constants such as ' rd ', ' st ', ' road ', ' street ', ' lane ', etc. with spaces,&lt;/P&gt;
&lt;P&gt;taking care with lowercase/uppercase. Then apply &lt;STRONG&gt;compbl&lt;/STRONG&gt; to the result and check&lt;/P&gt;
&lt;P&gt;whether there is more to do.&lt;/P&gt;</description>
      <pubDate>Fri, 11 Aug 2017 07:00:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/cleaning-character-columns/m-p/387238#M92823</guid>
      <dc:creator>Shmuel</dc:creator>
      <dc:date>2017-08-11T07:00:09Z</dc:date>
    </item>
    <item>
      <title>Re: cleaning character columns</title>
      <link>https://communities.sas.com/t5/SAS-Programming/cleaning-character-columns/m-p/387240#M92825</link>
      <description>&lt;P&gt;Write down every rule you want to apply to the variable Addresses, then start coding.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Maybe deleting the unwanted content is easier than extracting the required information, although the last line of your example adds complexity to that approach.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Regular expressions seem to be the best way to extract the street names.&lt;/P&gt;</description>
      <pubDate>Fri, 11 Aug 2017 07:03:18 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/cleaning-character-columns/m-p/387240#M92825</guid>
      <dc:creator>andreas_lds</dc:creator>
      <dc:date>2017-08-11T07:03:18Z</dc:date>
    </item>
    <item>
      <title>Re: cleaning character columns</title>
      <link>https://communities.sas.com/t5/SAS-Programming/cleaning-character-columns/m-p/387253#M92833</link>
      <description>&lt;P&gt;What is your final objective with cleaning address data? Is it by chance anything to do with address matching? If so there are tools and services available that cleanse, standardise and match addresses to a much higher level of quality than you are ever likely to achieve yourself.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Your addresses look like New Zealand ones. There are tools available with NZ address localisation that can do what you require without any coding, for example SAS's Dataflux.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 11 Aug 2017 07:57:15 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/cleaning-character-columns/m-p/387253#M92833</guid>
      <dc:creator>SASKiwi</dc:creator>
      <dc:date>2017-08-11T07:57:15Z</dc:date>
    </item>
    <item>
      <title>Re: cleaning character columns</title>
      <link>https://communities.sas.com/t5/SAS-Programming/cleaning-character-columns/m-p/387256#M92835</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you have a comprehensive list of possible street names at your disposal, you can use it to match your list of addresses.&lt;/P&gt;</description>
      <pubDate>Fri, 11 Aug 2017 08:07:14 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/cleaning-character-columns/m-p/387256#M92835</guid>
      <dc:creator>gamotte</dc:creator>
      <dc:date>2017-08-11T08:07:14Z</dc:date>
    </item>
    <item>
      <title>Re: cleaning character columns</title>
      <link>https://communities.sas.com/t5/SAS-Programming/cleaning-character-columns/m-p/387495#M92914</link>
      <description>&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;How do I use the transwrd function for multiple conditions?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;address = transwrd(upcase(addresses), 'ST', ' ');&lt;BR /&gt;address = transwrd(upcase(addresses), 'DR', ' ');&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;This code only keeps the last replacement. If I create multiple variables, e.g. address1 and address2, then I have two different variables, but I need the result in one column.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Any help is appreciated.&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 11 Aug 2017 21:16:03 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/cleaning-character-columns/m-p/387495#M92914</guid>
      <dc:creator>Scott86</dc:creator>
      <dc:date>2017-08-11T21:16:03Z</dc:date>
    </item>
    <item>
      <title>Re: cleaning character columns</title>
      <link>https://communities.sas.com/t5/SAS-Programming/cleaning-character-columns/m-p/387519#M92915</link>
      <description>&lt;P&gt;The function to replace a word is &lt;STRONG&gt;TRANWRD&lt;/STRONG&gt; (not tran&lt;STRONG&gt;s&lt;/STRONG&gt;wrd).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;To make multiple replacements, you can do:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;address = addresses;
address = tranwrd(upcase(address), ' ST', ' ');
address = tranwrd(upcase(address), ' DR', ' ');
address = compbl(address);
 &lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;I have added a space before 'ST' and 'DR' so that they are not replaced when they appear in the middle of a word&lt;/P&gt;
&lt;P&gt;(think of EA&lt;STRONG&gt;ST&lt;/STRONG&gt;ERN, AN&lt;STRONG&gt;DR&lt;/STRONG&gt;E)&lt;/P&gt;
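&lt;P&gt;If the list of suffixes grows, the same replacement can be sketched with a temporary array and a loop. This is only a sketch under assumptions: the output data set name want, the suffix list and the length of 200 are placeholders, while customer_addy and addresses come from the original question. It pads the value and each suffix with a blank on both sides, because a leading blank alone would still match the start of a longer word such as STONE.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want;                            /* hypothetical output data set name */
  set customer_addy;
  length address $200;                /* assumed to be long enough */
  /* assumed suffix list, extend as needed */
  array suffix[5] $8 _temporary_ ('ST' 'DR' 'RD' 'LANE' 'DRIVE');

  /* surround the value with blanks so that every word has a blank on both sides */
  address = ' ' || upcase(strip(addresses)) || ' ';

  /* each pass works on the result of the previous one */
  do i = 1 to dim(suffix);
    address = tranwrd(address, ' ' || strip(suffix[i]) || ' ', ' ');
  end;

  /* collapse repeated blanks and trim both ends */
  address = strip(compbl(address));
  drop i;
run;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Whole-word matching also removes a leading ST, as in ST JOHNS RD, so check the result against your data.&lt;/P&gt;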
</description>
      <pubDate>Fri, 11 Aug 2017 23:56:06 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/cleaning-character-columns/m-p/387519#M92915</guid>
      <dc:creator>Shmuel</dc:creator>
      <dc:date>2017-08-11T23:56:06Z</dc:date>
    </item>
  </channel>
</rss>

