<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Address Matching in SAS Procedures</title>
    <link>https://communities.sas.com/t5/SAS-Procedures/Address-Matching/m-p/47891#M12903</link>
    <description>Hi:&lt;BR /&gt;
  I know that the DataFlux tool, available as part of the Data Integration Studio, can be used to "clean up" data like you show. &lt;BR /&gt;
 &lt;BR /&gt;
  For more information, refer to:&lt;BR /&gt;
&lt;A href="http://www.dataflux.com/home.aspx?lang=en-us" target="_blank"&gt;http://www.dataflux.com/home.aspx?lang=en-us&lt;/A&gt;&lt;BR /&gt;
&lt;A href="http://www.sas.com/data-quality/df-integration-server/index.html" target="_blank"&gt;http://www.sas.com/data-quality/df-integration-server/index.html&lt;/A&gt;&lt;BR /&gt;
&lt;A href="http://www.sas.com/technologies/dw/index.html" target="_blank"&gt;http://www.sas.com/technologies/dw/index.html&lt;/A&gt;&lt;BR /&gt;
 &lt;BR /&gt;
I'm not sure whether the DataFlux Studio is still a standalone product or not. You may want to check with your Sales Rep or with Tech Support on this.&lt;BR /&gt;
&lt;BR /&gt;
cynthia</description>
    <pubDate>Thu, 09 Dec 2010 20:58:24 GMT</pubDate>
    <dc:creator>Cynthia_sas</dc:creator>
    <dc:date>2010-12-09T20:58:24Z</dc:date>
    <item>
      <title>Address Matching</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Address-Matching/m-p/47889#M12901</link>
      <description>Here are two addresses:&lt;BR /&gt;
&lt;BR /&gt;
128 W. Main Street, Noland, NW&lt;BR /&gt;
128 West Main St., Noland, NW&lt;BR /&gt;
&lt;BR /&gt;
They are the same address. I want to know if SAS has any tool to tell me they are the same. The addresses can be a lot more complicated than the above. It is hard to come up with the rules and put them into a program to match them all.</description>
      <pubDate>Thu, 09 Dec 2010 20:08:37 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Address-Matching/m-p/47889#M12901</guid>
      <dc:creator>MarcTC</dc:creator>
      <dc:date>2010-12-09T20:08:37Z</dc:date>
    </item>
    <item>
      <title>Re: Address Matching</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Address-Matching/m-p/47890#M12902</link>
      <description>Hello MarcTC,&lt;BR /&gt;
&lt;BR /&gt;
I think that this can be achieved by formats. However, you still have to determine all rules to identify possible variations. For example, for street:&lt;BR /&gt;
[pre]&lt;BR /&gt;
proc format;&lt;BR /&gt;
   value $st&lt;BR /&gt;
   "ST."="Street"&lt;BR /&gt;
   "ST"="Street"&lt;BR /&gt;
   "Street"="Street"&lt;BR /&gt;
   /* other possible variants go here */&lt;BR /&gt;
   ;&lt;BR /&gt;
run;&lt;BR /&gt;
[/pre]&lt;BR /&gt;
Sincerely,&lt;BR /&gt;
SPR</description>
      <pubDate>Thu, 09 Dec 2010 20:46:01 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Address-Matching/m-p/47890#M12902</guid>
      <dc:creator>SPR</dc:creator>
      <dc:date>2010-12-09T20:46:01Z</dc:date>
    </item>
    <item>
      <title>Re: Address Matching</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Address-Matching/m-p/47891#M12903</link>
      <description>Hi:&lt;BR /&gt;
  I know that the DataFlux tool, available as part of the Data Integration Studio, can be used to "clean up" data like you show. &lt;BR /&gt;
 &lt;BR /&gt;
  For more information, refer to:&lt;BR /&gt;
&lt;A href="http://www.dataflux.com/home.aspx?lang=en-us" target="_blank"&gt;http://www.dataflux.com/home.aspx?lang=en-us&lt;/A&gt;&lt;BR /&gt;
&lt;A href="http://www.sas.com/data-quality/df-integration-server/index.html" target="_blank"&gt;http://www.sas.com/data-quality/df-integration-server/index.html&lt;/A&gt;&lt;BR /&gt;
&lt;A href="http://www.sas.com/technologies/dw/index.html" target="_blank"&gt;http://www.sas.com/technologies/dw/index.html&lt;/A&gt;&lt;BR /&gt;
 &lt;BR /&gt;
I'm not sure whether the DataFlux Studio is still a standalone product or not. You may want to check with your Sales Rep or with Tech Support on this.&lt;BR /&gt;
&lt;BR /&gt;
cynthia</description>
      <pubDate>Thu, 09 Dec 2010 20:58:24 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Address-Matching/m-p/47891#M12903</guid>
      <dc:creator>Cynthia_sas</dc:creator>
      <dc:date>2010-12-09T20:58:24Z</dc:date>
    </item>
    <item>
      <title>Re: Address Matching</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Address-Matching/m-p/47892#M12904</link>
      <description>I'm not sure that SAS has that many built-in tools (though it does have the functions that allow one to build them).  However, there is lots of commercial software for doing "address normalization."  I am familiar with them for US mailings (I've used this company, but there are others, &lt;A href="http://www.melissadata.com/" target="_blank"&gt;http://www.melissadata.com/&lt;/A&gt; ).&lt;BR /&gt;
&lt;BR /&gt;
If you first ran all of your addresses through a program like they sell, then the matching process would be much simpler.</description>
      <pubDate>Fri, 10 Dec 2010 03:05:04 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Address-Matching/m-p/47892#M12904</guid>
      <dc:creator>Doc_Duke</dc:creator>
      <dc:date>2010-12-10T03:05:04Z</dc:date>
    </item>
    <item>
      <title>Re: Address Matching</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Address-Matching/m-p/47893#M12905</link>
      <description>Can PROC GEOCODE do such address standardization/correction/normalization? I don't have SAS 9.2 TS2M3, so I can't test it out.&lt;BR /&gt;
&lt;BR /&gt;
According to some internet posts, the Google Maps API can do this task. SAS has a product called SAS Google Map Generator. I wonder if this generator allows users to access Google Maps' address normalization function.</description>
      <pubDate>Fri, 10 Dec 2010 04:12:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Address-Matching/m-p/47893#M12905</guid>
      <dc:creator>MarcTC</dc:creator>
      <dc:date>2010-12-10T04:12:50Z</dc:date>
    </item>
    <item>
      <title>Re: Address Matching</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Address-Matching/m-p/47894#M12906</link>
      <description>Hi&lt;BR /&gt;
&lt;BR /&gt;
I believe with SAS it's either the Data Quality Server (DataFlux) or you have to develop a set of regular expressions (which will be painful).&lt;BR /&gt;
&lt;BR /&gt;
I just did a quick Google search. &lt;BR /&gt;
There are sites with RegEx patterns which might help you, e.g.:&lt;BR /&gt;
&lt;A href="http://regexlib.com/DisplayPatterns.aspx?categoryId=7&amp;amp;cattabindex=6" target="_blank"&gt;http://regexlib.com/DisplayPatterns.aspx?categoryId=7&amp;amp;cattabindex=6&lt;/A&gt;&lt;BR /&gt;
&lt;BR /&gt;
My thinking is:&lt;BR /&gt;
You could use PRXCHANGE() to transform similar patterns to one standard string - and then compare these standard strings.&lt;BR /&gt;
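&lt;BR /&gt;
A minimal sketch of that idea follows. The data set name, the variable names and the three substitution rules are only assumptions for illustration; a real rule set would be much longer, and "ST" can also mean "Saint".&lt;BR /&gt;
[pre]&lt;BR /&gt;
/* hypothetical sample data: the two addresses from the first post */&lt;BR /&gt;
data addresses;&lt;BR /&gt;
   infile datalines truncover;&lt;BR /&gt;
   input address $char60.;&lt;BR /&gt;
   datalines;&lt;BR /&gt;
128 W. Main Street, Noland, NW&lt;BR /&gt;
128 West Main St., Noland, NW&lt;BR /&gt;
;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
data standardized;&lt;BR /&gt;
   set addresses;&lt;BR /&gt;
   length std_address $200;&lt;BR /&gt;
   /* work in upper case so the rules are case-insensitive */&lt;BR /&gt;
   std_address = upcase(address);&lt;BR /&gt;
   /* drop periods and commas */&lt;BR /&gt;
   std_address = prxchange('s/[.,]//', -1, std_address);&lt;BR /&gt;
   /* expand a few abbreviations; \b keeps the W in NW untouched */&lt;BR /&gt;
   std_address = prxchange('s/\bW\b/WEST/', -1, std_address);&lt;BR /&gt;
   std_address = prxchange('s/\bST\b/STREET/', -1, std_address);&lt;BR /&gt;
   /* collapse runs of blanks into one */&lt;BR /&gt;
   std_address = compbl(std_address);&lt;BR /&gt;
run;&lt;BR /&gt;
[/pre]&lt;BR /&gt;
With these rules both sample addresses should reduce to 128 WEST MAIN STREET NOLAND NW, so a plain equality test on std_address matches them.&lt;BR /&gt;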
&lt;BR /&gt;
&lt;BR /&gt;
HTH&lt;BR /&gt;
Patrick&lt;BR /&gt;
&lt;BR /&gt;
Message was edited by: Patrick</description>
      <pubDate>Fri, 10 Dec 2010 07:19:24 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Address-Matching/m-p/47894#M12906</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2010-12-10T07:19:24Z</dc:date>
    </item>
    <item>
      <title>Re: Address Matching</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Address-Matching/m-p/47895#M12907</link>
      <description>I just checked some online articles. It seems PROC GEOCODE's street-level address geocoding only produces X/Y coordinates and no normalized address. Can I use X/Y coordinates to match the data?</description>
      <pubDate>Fri, 10 Dec 2010 07:41:28 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Address-Matching/m-p/47895#M12907</guid>
      <dc:creator>MarcTC</dc:creator>
      <dc:date>2010-12-10T07:41:28Z</dc:date>
    </item>
    <item>
      <title>Re: Address Matching</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Address-Matching/m-p/47896#M12908</link>
      <description>Documentation and other resources for SAS Data Quality Server are on its Software Product page:&lt;BR /&gt;
&lt;BR /&gt;
&lt;A href="http://support.sas.com/software/products/dataqual/" target="_blank"&gt;http://support.sas.com/software/products/dataqual/&lt;/A&gt;&lt;BR /&gt;
&lt;BR /&gt;
SAS Data Quality Server is sold as part of the SAS Data Quality Solution, and it is also available through two software offerings, SAS Data Integration Server and SAS Enterprise Data Integration Server: &lt;BR /&gt;
&lt;A href="http://support.sas.com/documentation/onlinedoc/dis/" target="_blank"&gt;http://support.sas.com/documentation/onlinedoc/dis/&lt;/A&gt;&lt;BR /&gt;
&lt;A href="http://support.sas.com/documentation/onlinedoc/entdis/" target="_blank"&gt;http://support.sas.com/documentation/onlinedoc/entdis/&lt;/A&gt;&lt;BR /&gt;
&lt;BR /&gt;
These offerings include DataFlux products that the SAS Data Quality Server interacts with.</description>
      <pubDate>Mon, 13 Dec 2010 20:46:04 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Address-Matching/m-p/47896#M12908</guid>
      <dc:creator>SusanJ516_sas</dc:creator>
      <dc:date>2010-12-13T20:46:04Z</dc:date>
    </item>
    <item>
      <title>Re: Address Matching</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Address-Matching/m-p/47897#M12909</link>
      <description>I have done this before. The basic procedure is first to separate the parts of the address into their own fields (i.e. street_address, city, state, zip). The easiest way to do this is with regular expressions.&lt;BR /&gt;
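&lt;BR /&gt;
As a rough sketch of this first step, assuming addresses shaped like the example above ("street, city, state" separated by commas, with no zip); the data set name, the variable names and the pattern itself are only illustrative:&lt;BR /&gt;
[pre]&lt;BR /&gt;
/* hypothetical sample data */&lt;BR /&gt;
data addresses;&lt;BR /&gt;
   infile datalines truncover;&lt;BR /&gt;
   input address $char60.;&lt;BR /&gt;
   datalines;&lt;BR /&gt;
128 W. Main Street, Noland, NW&lt;BR /&gt;
128 West Main St., Noland, NW&lt;BR /&gt;
;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
data parsed;&lt;BR /&gt;
   set addresses;&lt;BR /&gt;
   length street_address city $60 state $2;&lt;BR /&gt;
   retain rx;&lt;BR /&gt;
   /* compile the pattern once: three comma-separated parts */&lt;BR /&gt;
   if _n_ = 1 then&lt;BR /&gt;
      rx = prxparse('/^\s*([^,]+),\s*([^,]+),\s*([A-Za-z]{2})\s*$/');&lt;BR /&gt;
   /* fields stay blank for addresses that do not fit the pattern */&lt;BR /&gt;
   if prxmatch(rx, address) then do;&lt;BR /&gt;
      street_address = prxposn(rx, 1, address);&lt;BR /&gt;
      city           = prxposn(rx, 2, address);&lt;BR /&gt;
      state          = prxposn(rx, 3, address);&lt;BR /&gt;
   end;&lt;BR /&gt;
   drop rx;&lt;BR /&gt;
run;&lt;BR /&gt;
[/pre]&lt;BR /&gt;
Rows that come out with blank fields are the ones that need another pattern or a manual look.&lt;BR /&gt;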
&lt;BR /&gt;
Secondly, you will need to standardize the wording in the addresses, e.g. substitute the full name for every abbreviation.  In your example above, if you come across "W" or "W." or "Wst" etc. in your street_address field, change them all to "West".  The easiest way to get a list of common abbreviations is to 'tokenize' the entire address so that you get a frequency count of the words used in the dataset.  Common abbreviations like Rd, Ln, St etc. will bubble to the top.  You then manually make a mapping using whatever technique you like best.&lt;BR /&gt;
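&lt;BR /&gt;
The frequency count could look something like this. It is only a sketch; the data set and variable names are made up, and the delimiter list (blank, comma, period) is an assumption about what separates words in your data:&lt;BR /&gt;
[pre]&lt;BR /&gt;
/* hypothetical sample data */&lt;BR /&gt;
data addresses;&lt;BR /&gt;
   infile datalines truncover;&lt;BR /&gt;
   input address $char60.;&lt;BR /&gt;
   datalines;&lt;BR /&gt;
128 W. Main Street, Noland, NW&lt;BR /&gt;
128 West Main St., Noland, NW&lt;BR /&gt;
;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
/* one output row per word in each address */&lt;BR /&gt;
data tokens;&lt;BR /&gt;
   set addresses;&lt;BR /&gt;
   length token $40;&lt;BR /&gt;
   do i = 1 to countw(address, ' ,.');&lt;BR /&gt;
      token = upcase(scan(address, i, ' ,.'));&lt;BR /&gt;
      output;&lt;BR /&gt;
   end;&lt;BR /&gt;
   keep token;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
/* most frequent words first */&lt;BR /&gt;
proc freq data=tokens order=freq;&lt;BR /&gt;
   tables token / nocum nopercent;&lt;BR /&gt;
run;&lt;BR /&gt;
[/pre]&lt;BR /&gt;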
&lt;BR /&gt;
Lastly, you can use the SAS soundex() function to identify addresses that are the same but may contain typos or misspellings.  For example, Main Street, Main Streat and Maine Street would all be considered the same using the soundex() function.  When you have a match on, say, soundex(street_address) + zip + name, you can be reasonably certain that it is the same address even when there are misspellings and/or typos.&lt;BR /&gt;
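&lt;BR /&gt;
A sketch of building such a match key, using made-up sample records; the data set name, the variable names and the '|' separator are assumptions. Check on your own data how soundex() treats the house number before relying on the key:&lt;BR /&gt;
[pre]&lt;BR /&gt;
/* hypothetical sample data, already split into fields */&lt;BR /&gt;
data parsed_addresses;&lt;BR /&gt;
   infile datalines dlm='|' truncover;&lt;BR /&gt;
   input name :$20. street_address :$40. zip :$5.;&lt;BR /&gt;
   datalines;&lt;BR /&gt;
Smith|128 West Main Street|12345&lt;BR /&gt;
Smith|128 West Maine Streat|12345&lt;BR /&gt;
Jones|45 Oak Lane|12345&lt;BR /&gt;
;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
/* key = soundex of the street + zip + name */&lt;BR /&gt;
data keyed;&lt;BR /&gt;
   set parsed_addresses;&lt;BR /&gt;
   length match_key $120;&lt;BR /&gt;
   match_key = catx('|', soundex(street_address), zip, upcase(name));&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
proc sort data=keyed;&lt;BR /&gt;
   by match_key;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
/* keep only keys shared by more than one record */&lt;BR /&gt;
data possible_matches;&lt;BR /&gt;
   set keyed;&lt;BR /&gt;
   by match_key;&lt;BR /&gt;
   if not (first.match_key and last.match_key);&lt;BR /&gt;
run;&lt;BR /&gt;
[/pre]&lt;BR /&gt;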
&lt;BR /&gt;
Hope this helps.&lt;BR /&gt;
&lt;BR /&gt;
Cheers&lt;BR /&gt;
Rob</description>
      <pubDate>Mon, 13 Dec 2010 21:50:57 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Address-Matching/m-p/47897#M12909</guid>
      <dc:creator>r_bomb</dc:creator>
      <dc:date>2010-12-13T21:50:57Z</dc:date>
    </item>
  </channel>
</rss>

