<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Proc SQL - Duplicate Findings (SOUNDEX) in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Proc-SQL-Duplicate-Findings-SOUNDEX/m-p/161002#M31324</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi Everyone,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;National_ID&amp;nbsp; TelePhone&lt;/P&gt;&lt;P&gt;11111 9833293682&lt;/P&gt;&lt;P&gt;11111 9833293682&lt;/P&gt;&lt;P&gt;22222 9833293682&lt;/P&gt;&lt;P&gt;33333 9833293682&lt;/P&gt;&lt;P&gt;44444 9902247880&lt;/P&gt;&lt;P&gt;55555 9902247880&lt;/P&gt;&lt;P&gt;66666 9999999999&lt;/P&gt;&lt;P&gt;77777 8888888888&lt;/P&gt;&lt;P&gt;77777 8888888888&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Here we have IDs and telephone numbers. Each National_ID identifies one person, who submitted a phone number.&lt;/P&gt;&lt;P&gt;I am looking for the list of phone numbers that were submitted under different National_IDs.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The output should be:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-size: 10pt; line-height: 1.5em;"&gt;National_ID&amp;nbsp; TelePhone&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;11111&amp;nbsp;&amp;nbsp; 9833293682&lt;/P&gt;&lt;P&gt;22222&amp;nbsp;&amp;nbsp; 9833293682&lt;/P&gt;&lt;P&gt;33333&amp;nbsp;&amp;nbsp; 9833293682&lt;/P&gt;&lt;P&gt;44444&amp;nbsp;&amp;nbsp; 9902247880&lt;/P&gt;&lt;P&gt;55555&amp;nbsp;&amp;nbsp; 9902247880&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Here, 11111, 22222 and 33333 share the same TelePhone (9833293682), and 44444 and 55555 share the same TelePhone (9902247880).&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-size: 10pt; line-height: 1.5em;"&gt;&lt;BR /&gt;Thanks in advance!&lt;/SPAN&gt;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Fri, 02 Jan 2015 11:15:28 GMT</pubDate>
    <dc:creator>sas_lak</dc:creator>
    <dc:date>2015-01-02T11:15:28Z</dc:date>
    <item>
      <title>Proc SQL - Duplicate Findings (SOUNDEX)</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Proc-SQL-Duplicate-Findings-SOUNDEX/m-p/160995#M31317</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;I need to find duplicate telephone numbers, first by National_ID and then by Customer_Name.&lt;/P&gt;&lt;P&gt;A) Tele_dup&lt;/P&gt;&lt;P&gt;1. By National_ID&lt;/P&gt;
&lt;TABLE border="0" cellpadding="0" cellspacing="0" style="width: 576px;"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD class="xl66" height="20" width="80"&gt;National_ID&lt;/TD&gt;&lt;TD class="xl67" width="118"&gt;&amp;nbsp; Customer_Name&lt;/TD&gt;&lt;TD class="xl67" width="144"&gt;&amp;nbsp; Telephone&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD class="xl65" height="20"&gt;1&lt;/TD&gt;&lt;TD&gt;DEVID&lt;/TD&gt;&lt;TD&gt;+19 99999999&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD class="xl65" height="20"&gt;2&lt;/TD&gt;&lt;TD&gt;MARK&lt;/TD&gt;&lt;TD&gt;+19 99999999&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD class="xl65" height="20"&gt;3&lt;/TD&gt;&lt;TD&gt;JIM&lt;/TD&gt;&lt;TD&gt;+19 8888888&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD class="xl65" height="20"&gt;1&lt;/TD&gt;&lt;TD&gt;DEVID&lt;/TD&gt;&lt;TD&gt;+19 7777777&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD class="xl65" height="20"&gt;2&lt;/TD&gt;&lt;TD&gt;MARK HENRY&lt;/TD&gt;&lt;TD&gt;+19 99999999&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD class="xl65" height="20"&gt;4&lt;/TD&gt;&lt;TD&gt;THOMAS&lt;/TD&gt;&lt;TD&gt;+19 6666666&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD class="xl65" height="20"&gt;5&lt;/TD&gt;&lt;TD&gt;BALE&lt;/TD&gt;&lt;TD&gt;+19 5555555&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD class="xl65" height="20"&gt;6&lt;/TD&gt;&lt;TD&gt;PITT&lt;/TD&gt;&lt;TD&gt;+19 6666666&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD class="xl65" height="20"&gt;7&lt;/TD&gt;&lt;TD&gt;WOOD&lt;/TD&gt;&lt;TD&gt;+19 99999999&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;
&lt;P&gt;&lt;STRONG&gt;The output should look like this:&lt;/STRONG&gt;&lt;/P&gt;
&lt;TABLE border="0" cellpadding="0" cellspacing="0"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD class="xl66"&gt;National_ID&lt;/TD&gt;&lt;TD class="xl67"&gt;&amp;nbsp; Telephone&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD align="right"&gt;1&lt;/TD&gt;&lt;TD&gt;+19 99999999&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD align="right"&gt;2&lt;/TD&gt;&lt;TD&gt;+19 99999999&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD align="right"&gt;7&lt;/TD&gt;&lt;TD&gt;+19 99999999&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD align="right"&gt;4&lt;/TD&gt;&lt;TD&gt;+19 6666666&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD align="right"&gt;6&lt;/TD&gt;&lt;TD&gt;+19 6666666&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;
&lt;P&gt;&lt;/P&gt;&lt;P&gt;B) Name_Dups&lt;/P&gt;&lt;P&gt;2. By Customer_Name&lt;/P&gt;
&lt;TABLE border="0" cellpadding="0" cellspacing="0" style="width: 578px;"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD class="xl66" height="20" width="80"&gt;National_ID&lt;/TD&gt;&lt;TD class="xl67" width="118"&gt;&amp;nbsp; Customer_Name&lt;/TD&gt;&lt;TD class="xl67" width="144"&gt;&amp;nbsp; Telephone&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD class="xl65" height="20"&gt;1&lt;/TD&gt;&lt;TD&gt;MASS&lt;/TD&gt;&lt;TD&gt;+19 99999999&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD class="xl65" height="20"&gt;1&lt;/TD&gt;&lt;TD&gt;MOSS&lt;/TD&gt;&lt;TD&gt;+19 99999999&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD class="xl65" height="20"&gt;2&lt;/TD&gt;&lt;TD&gt;DEVID&lt;/TD&gt;&lt;TD&gt;+19 8888888&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD class="xl65" height="20"&gt;3&lt;/TD&gt;&lt;TD&gt;DEVID THOMAS&lt;/TD&gt;&lt;TD&gt;+19 7777777&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD class="xl65" height="20"&gt;4&lt;/TD&gt;&lt;TD&gt;JOHN&lt;/TD&gt;&lt;TD&gt;+19 99999999&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD class="xl65" height="20"&gt;5&lt;/TD&gt;&lt;TD&gt;HENRY&lt;/TD&gt;&lt;TD&gt;+19 6666666&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD class="xl65" height="20"&gt;6&lt;/TD&gt;&lt;TD&gt;WOOD&lt;/TD&gt;&lt;TD&gt;+19 5555555&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD class="xl65" height="20"&gt;7&lt;/TD&gt;&lt;TD&gt;MARK HENRY&lt;/TD&gt;&lt;TD&gt;+19 6666666&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD class="xl65" height="20"&gt;8&lt;/TD&gt;&lt;TD&gt;WOOD&lt;/TD&gt;&lt;TD&gt;+19 99999999&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;
&lt;P&gt;&lt;STRONG&gt;The output should look like this:&lt;/STRONG&gt;&lt;/P&gt;
&lt;TABLE border="0" cellpadding="0" cellspacing="0"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD class="xl66"&gt;National_ID&lt;/TD&gt;&lt;TD class="xl67"&gt;&amp;nbsp; Telephone&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD class="xl65"&gt;1&lt;/TD&gt;&lt;TD&gt;+19 99999999&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD class="xl65"&gt;4&lt;/TD&gt;&lt;TD&gt;+19 99999999&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD class="xl65"&gt;8&lt;/TD&gt;&lt;TD&gt;+19 99999999&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;
&lt;P&gt;&lt;/P&gt;&lt;P&gt;A)&lt;/P&gt;&lt;P&gt;proc sql ;&lt;/P&gt;&lt;P&gt;create table Dups_BY_ID as&lt;/P&gt;&lt;P&gt;select distinct a.National_ID,a.Telephone&lt;/P&gt;&lt;P&gt;from Tele_dup as a, Tele_dup as b&lt;/P&gt;&lt;P&gt;where&lt;/P&gt;&lt;P&gt;a.National_ID ne b.National_ID and a.Telephone eq b.Telephone&lt;/P&gt;&lt;P&gt;order by 2;&lt;/P&gt;&lt;P&gt;quit;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;B)&lt;/P&gt;&lt;P&gt;proc sql ;&lt;/P&gt;&lt;P&gt;create table Dups_BY_Name as&lt;/P&gt;&lt;P&gt;select distinct a.National_ID,a.Telephone&lt;/P&gt;&lt;P&gt;from Name_Dups as a, Name_Dups as b&lt;/P&gt;&lt;P&gt;where&lt;/P&gt;&lt;P&gt;soundex(a.Customer_Name) ne soundex(b.Customer_Name) and a.Telephone eq b.Telephone&lt;/P&gt;&lt;P&gt;order by 2;&lt;/P&gt;&lt;P&gt;quit;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;These two queries give the correct output on small data, but on a huge table they take a long time because each one runs as a Cartesian join.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Please suggest simple code for finding duplicates in a huge single table.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks in advance!&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 23 Dec 2014 05:17:00 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Proc-SQL-Duplicate-Findings-SOUNDEX/m-p/160995#M31317</guid>
      <dc:creator>sas_lak</dc:creator>
      <dc:date>2014-12-23T05:17:00Z</dc:date>
    </item>
    <item>
      <title>Re: Proc SQL - Duplicate Findings (SOUNDEX)</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Proc-SQL-Duplicate-Findings-SOUNDEX/m-p/160996#M31318</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Have you looked into the PROC SORT options? Or a hash table sort?&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 23 Dec 2014 05:47:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Proc-SQL-Duplicate-Findings-SOUNDEX/m-p/160996#M31318</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2014-12-23T05:47:46Z</dc:date>
    </item>
    <item>
      <title>Re: Proc SQL - Duplicate Findings (SOUNDEX)</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Proc-SQL-Duplicate-Findings-SOUNDEX/m-p/160997#M31319</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Not yet, but I want to get "the same phone number repeated for different persons (or different IDs)" from a single table.&lt;/P&gt;&lt;P&gt;Could you please share code for that?&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 23 Dec 2014 06:07:47 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Proc-SQL-Duplicate-Findings-SOUNDEX/m-p/160997#M31319</guid>
      <dc:creator>sas_lak</dc:creator>
      <dc:date>2014-12-23T06:07:47Z</dc:date>
    </item>
    <item>
      <title>Re: Proc SQL - Duplicate Findings (SOUNDEX)</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Proc-SQL-Duplicate-Findings-SOUNDEX/m-p/160998#M31320</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Even though PROC SQL doesn't require input tables to be sorted, sorting does help performance.&amp;nbsp; You could also create a covering index.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 23 Dec 2014 13:38:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Proc-SQL-Duplicate-Findings-SOUNDEX/m-p/160998#M31320</guid>
      <dc:creator>DBailey</dc:creator>
      <dc:date>2014-12-23T13:38:13Z</dc:date>
    </item>
    <item>
      <title>Re: Proc SQL - Duplicate Findings (SOUNDEX)</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Proc-SQL-Duplicate-Findings-SOUNDEX/m-p/160999#M31321</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;This is another option that eliminates the Cartesian product and may help performance. Here you join all of the data with the subset of telephone numbers that occur more than once. I ran it on your sample data and it worked great. For the full table, I would put an index on the telephone column to improve performance, since you are joining on it.&lt;/P&gt;&lt;P style="font-family: 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif; background-color: #ffffff;"&gt;&lt;/P&gt;&lt;P style="font-family: 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif; background-color: #ffffff;"&gt;proc sql ;&lt;/P&gt;&lt;P style="font-family: 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif; background-color: #ffffff;"&gt;create table Dups_BY_ID as&lt;/P&gt;&lt;P&gt;SELECT &lt;/P&gt;&lt;P&gt;&amp;nbsp; a.national_id, &lt;/P&gt;&lt;P&gt;&amp;nbsp; a.telephone &lt;/P&gt;&lt;P&gt;FROM &lt;/P&gt;&lt;P&gt;&amp;nbsp; tele_dup a&lt;/P&gt;&lt;P&gt;where &lt;/P&gt;&lt;P&gt;&amp;nbsp; a.telephone in (&lt;/P&gt;&lt;P&gt;&amp;nbsp; select &lt;/P&gt;&lt;P&gt;&amp;nbsp; telephone&lt;/P&gt;&lt;P&gt;&amp;nbsp; from &lt;/P&gt;&lt;P&gt;&amp;nbsp; tele_dup&lt;/P&gt;&lt;P&gt;&amp;nbsp; group by &lt;/P&gt;&lt;P&gt;&amp;nbsp; telephone&lt;/P&gt;&lt;P&gt;&amp;nbsp; having &lt;/P&gt;&lt;P&gt;&amp;nbsp; count(*) &amp;gt; 1&lt;/P&gt;&lt;P&gt;)&lt;/P&gt;&lt;P&gt;&amp;nbsp; order by 2 desc;&lt;/P&gt;&lt;P&gt;quit;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 23 Dec 2014 14:25:49 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Proc-SQL-Duplicate-Findings-SOUNDEX/m-p/160999#M31321</guid>
      <dc:creator>skillman</dc:creator>
      <dc:date>2014-12-23T14:25:49Z</dc:date>
    </item>
    <item>
      <title>Re: Proc SQL - Duplicate Findings (SOUNDEX)</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Proc-SQL-Duplicate-Findings-SOUNDEX/m-p/161000#M31322</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Thank you,&lt;/P&gt;&lt;P&gt;Compared with the code I was using earlier, yours is more efficient.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;But the above code returns rows where the &lt;STRONG&gt;same ID has the same phone number&lt;/STRONG&gt;.&lt;/P&gt;&lt;P&gt;Example:&lt;/P&gt;&lt;P&gt;&lt;STRONG style="background-color: #ffffff; font-family: 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif;"&gt;National_ID&amp;nbsp;&amp;nbsp; &lt;SPAN style="font-family: 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif; background-color: #ffffff;"&gt;Telephone&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;100&amp;nbsp;&amp;nbsp;&amp;nbsp; +19 99999999&lt;/P&gt;&lt;P&gt;100&amp;nbsp;&amp;nbsp;&amp;nbsp; +19 99999999&amp;nbsp;&amp;nbsp;&amp;nbsp; (here the same phone is repeated for the same ID)&lt;/P&gt;&lt;P&gt;200&amp;nbsp;&amp;nbsp;&amp;nbsp; +19 77777777&lt;/P&gt;&lt;P&gt;200&amp;nbsp;&amp;nbsp;&amp;nbsp; +19 77777777&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I am looking for rows where the &lt;STRONG&gt;same phone number is repeated for different IDs&lt;/STRONG&gt;.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Example:&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG style="background-color: #ffffff; font-family: 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif;"&gt;National_ID&amp;nbsp;&amp;nbsp; &lt;SPAN style="font-family: 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif; background-color: #ffffff;"&gt;Telephone&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;100&amp;nbsp;&amp;nbsp;&amp;nbsp; +19 99999999&lt;/P&gt;&lt;P&gt;200&amp;nbsp;&amp;nbsp;&amp;nbsp; +19 99999999&amp;nbsp;&amp;nbsp;&amp;nbsp; (the same phone is repeated for different IDs)&lt;/P&gt;&lt;P&gt;XXX&amp;nbsp;&amp;nbsp;&amp;nbsp; +19 77777777&lt;/P&gt;&lt;P&gt;YYY&amp;nbsp;&amp;nbsp;&amp;nbsp; +19 77777777&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 24 Dec 2014 04:30:26 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Proc-SQL-Duplicate-Findings-SOUNDEX/m-p/161000#M31322</guid>
      <dc:creator>sas_lak</dc:creator>
      <dc:date>2014-12-24T04:30:26Z</dc:date>
    </item>
    <item>
      <title>Re: Proc SQL - Duplicate Findings (SOUNDEX)</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Proc-SQL-Duplicate-Findings-SOUNDEX/m-p/161001#M31323</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Try this one&lt;/P&gt;&lt;P&gt;proc sql;&lt;/P&gt;&lt;P&gt;&amp;nbsp; create table want(drop = tot) as&lt;/P&gt;&lt;P&gt;&amp;nbsp; select national_id,telephone,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; count(telephone) as tot&lt;/P&gt;&lt;P&gt;&amp;nbsp; from have&lt;/P&gt;&lt;P&gt;&amp;nbsp; group by 1,2&lt;/P&gt;&lt;P&gt;&amp;nbsp; having tot &amp;gt; 1;&lt;/P&gt;&lt;P&gt;quit;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;proc sql;&lt;/P&gt;&lt;P&gt;&amp;nbsp; create table want1(drop = tot) as&lt;/P&gt;&lt;P&gt;&amp;nbsp; select customer_name,telephone,national_id,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; count(telephone) as tot&lt;/P&gt;&lt;P&gt;&amp;nbsp; from have&lt;/P&gt;&lt;P&gt;&amp;nbsp; group by 1,2&lt;/P&gt;&lt;P&gt;&amp;nbsp; having tot &amp;gt; 1;&lt;/P&gt;&lt;P&gt;quit;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;proc append base = want data = want1 force;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 31 Dec 2014 17:06:10 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Proc-SQL-Duplicate-Findings-SOUNDEX/m-p/161001#M31323</guid>
      <dc:creator>UrvishShah</dc:creator>
      <dc:date>2014-12-31T17:06:10Z</dc:date>
    </item>
    <item>
      <title>Re: Proc SQL - Duplicate Findings (SOUNDEX)</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Proc-SQL-Duplicate-Findings-SOUNDEX/m-p/161002#M31324</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi Everyone,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;National_ID&amp;nbsp; TelePhone&lt;/P&gt;&lt;P&gt;11111 9833293682&lt;/P&gt;&lt;P&gt;11111 9833293682&lt;/P&gt;&lt;P&gt;22222 9833293682&lt;/P&gt;&lt;P&gt;33333 9833293682&lt;/P&gt;&lt;P&gt;44444 9902247880&lt;/P&gt;&lt;P&gt;55555 9902247880&lt;/P&gt;&lt;P&gt;66666 9999999999&lt;/P&gt;&lt;P&gt;77777 8888888888&lt;/P&gt;&lt;P&gt;77777 8888888888&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Here we have IDs and telephone numbers. Each National_ID identifies one person, who submitted a phone number.&lt;/P&gt;&lt;P&gt;I am looking for the list of phone numbers that were submitted under different National_IDs.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The output should be:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-size: 10pt; line-height: 1.5em;"&gt;National_ID&amp;nbsp; TelePhone&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;11111&amp;nbsp;&amp;nbsp; 9833293682&lt;/P&gt;&lt;P&gt;22222&amp;nbsp;&amp;nbsp; 9833293682&lt;/P&gt;&lt;P&gt;33333&amp;nbsp;&amp;nbsp; 9833293682&lt;/P&gt;&lt;P&gt;44444&amp;nbsp;&amp;nbsp; 9902247880&lt;/P&gt;&lt;P&gt;55555&amp;nbsp;&amp;nbsp; 9902247880&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Here, 11111, 22222 and 33333 share the same TelePhone (9833293682), and 44444 and 55555 share the same TelePhone (9902247880).&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-size: 10pt; line-height: 1.5em;"&gt;&lt;BR /&gt;Thanks in advance!&lt;/SPAN&gt;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 02 Jan 2015 11:15:28 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Proc-SQL-Duplicate-Findings-SOUNDEX/m-p/161002#M31324</guid>
      <dc:creator>sas_lak</dc:creator>
      <dc:date>2015-01-02T11:15:28Z</dc:date>
    </item>
    <item>
      <title>Re: Proc SQL - Duplicate Findings (SOUNDEX)</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Proc-SQL-Duplicate-Findings-SOUNDEX/m-p/161003#M31325</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;PRE&gt;data have;
input National_ID TelePhone : $40.;
cards;
11111 9833293682
11111 9833293682
22222 9833293682
33333 9833293682
44444 9902247880
55555 9902247880
66666 9999999999
77777 8888888888
77777 8888888888
;
run;
proc sql;
  select distinct * from have
  group by TelePhone
  having count(distinct National_ID) gt 1;
quit;
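
/* A hedged alternative sketch, not part of the original reply. It should
   return the same rows without relying on PROC SQL remerging summary
   statistics, by filtering on a subquery instead. The output table name
   WANT is an assumption chosen for illustration. */
proc sql;
  create table want as
  select distinct National_ID, TelePhone
  from have
  where TelePhone in
    (select TelePhone
     from have
     group by TelePhone
     having count(distinct National_ID) gt 1)
  order by TelePhone, National_ID;
quit;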
 


&lt;/PRE&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Xia Keshan&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 02 Jan 2015 11:45:29 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Proc-SQL-Duplicate-Findings-SOUNDEX/m-p/161003#M31325</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2015-01-02T11:45:29Z</dc:date>
    </item>
    <item>
      <title>Re: Proc SQL - Duplicate Findings (SOUNDEX)</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Proc-SQL-Duplicate-Findings-SOUNDEX/m-p/161004#M31326</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;If you prefer a DATA step:&lt;/P&gt;&lt;P&gt;data have;&lt;BR /&gt;input National_ID $&amp;nbsp; TelePhone;&lt;BR /&gt;datalines;&lt;BR /&gt;11111 9833293682&lt;BR /&gt;11111 9833293682&lt;BR /&gt;22222 9833293682&lt;BR /&gt;33333 9833293682&lt;BR /&gt;44444 9902247880&lt;BR /&gt;55555 9902247880&lt;BR /&gt;66666 9999999999&lt;BR /&gt;77777 8888888888&lt;BR /&gt;77777 8888888888&lt;BR /&gt;;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;data want;&lt;BR /&gt;set have;&lt;BR /&gt;by national_id telephone notsorted;&lt;BR /&gt;retain TelePhone2;&lt;BR /&gt;if first.national_id then telephone2=telephone;&lt;BR /&gt;if not first.national_id and telephone=telephone2 then delete;&lt;BR /&gt;drop telephone2;&lt;BR /&gt;run;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 02 Jan 2015 11:55:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Proc-SQL-Duplicate-Findings-SOUNDEX/m-p/161004#M31326</guid>
      <dc:creator>naveen_srini</dc:creator>
      <dc:date>2015-01-02T11:55:53Z</dc:date>
    </item>
    <item>
      <title>Re: Proc SQL - Duplicate Findings (SOUNDEX)</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Proc-SQL-Duplicate-Findings-SOUNDEX/m-p/161005#M31327</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Thanks Xia Keshan,&lt;/P&gt;&lt;P&gt;I ran the above query on 30 million records. It took no more than 10 minutes, and the output looked good when I checked it at random.&lt;/P&gt;&lt;P&gt;Since this is urgent, I can apply it.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 02 Jan 2015 12:08:43 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Proc-SQL-Duplicate-Findings-SOUNDEX/m-p/161005#M31327</guid>
      <dc:creator>sas_lak</dc:creator>
      <dc:date>2015-01-02T12:08:43Z</dc:date>
    </item>
    <item>
      <title>Re: Proc SQL - Duplicate Findings (SOUNDEX)</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Proc-SQL-Duplicate-Findings-SOUNDEX/m-p/161006#M31328</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I can't believe you got it done within 10 minutes for 30 million records; that is incredible. I was thinking about a DATA step for data that big.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 02 Jan 2015 12:18:33 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Proc-SQL-Duplicate-Findings-SOUNDEX/m-p/161006#M31328</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2015-01-02T12:18:33Z</dc:date>
    </item>
    <item>
      <title>Re: Proc SQL - Duplicate Findings (SOUNDEX)</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Proc-SQL-Duplicate-Findings-SOUNDEX/m-p/161007#M31329</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;For testing purposes I ran the query on a sample dataset. It has 5316167 observations, and I am preparing the dataset for 30 million.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;That is why I was wondering whether it is OK.&lt;/P&gt;&lt;P&gt;Here is the log.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;NOTE: There were 3433276 observations read from the data set WORK.TELEPHONE_1.&lt;/P&gt;&lt;P&gt;NOTE: There were 1284869 observations read from the data set WORK.TELEPHONE_2.&lt;/P&gt;&lt;P&gt;NOTE: There were 562389 observations read from the data set WORK.TELEPHONE_3.&lt;/P&gt;&lt;P&gt;NOTE: There were 35406 observations read from the data set WORK.TELEPHONE_4.&lt;/P&gt;&lt;P&gt;NOTE: There were 227 observations read from the data set WORK.TELEPHONE_5.&lt;/P&gt;&lt;P&gt;NOTE: There were 0 observations read from the data set WORK.TELEPHONE_6.&lt;/P&gt;&lt;P&gt;NOTE: There were 0 observations read from the data set WORK.TELEPHONE_7.&lt;/P&gt;&lt;P&gt;NOTE: The data set WORK.PHONE has 5316167 observations and 3 variables.&lt;/P&gt;&lt;P&gt;NOTE: DATA statement used (Total process time):&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; real time&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 1.37 seconds&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; cpu time&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 1.34 seconds&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Total records: WORK.PHONE has 5316167 observations and 3 variables&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;proc sql;&lt;/P&gt;&lt;P&gt;90&amp;nbsp;&amp;nbsp; create table dup as select distinct Telphoneno,fid from Phone&lt;/P&gt;&lt;P&gt;91&amp;nbsp;&amp;nbsp; group by Telphoneno&lt;/P&gt;&lt;P&gt;92&amp;nbsp;&amp;nbsp; having count(distinct&amp;nbsp; fid) gt 1&lt;/P&gt;&lt;P&gt;93&amp;nbsp;&amp;nbsp; order by Telphoneno;&lt;/P&gt;&lt;P&gt;NOTE: The query requires remerging summary statistics back with the original data.&lt;/P&gt;&lt;P&gt;NOTE: Table WORK.DUP created, with 1334130 rows and 2 columns.&lt;/P&gt;&lt;P&gt;94&amp;nbsp;&amp;nbsp; quit;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;NOTE: PROCEDURE SQL used (Total process time):&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; real time&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 35.28 seconds&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; cpu time&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 38.46 seconds&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Please suggest any other alternatives for getting this result.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 02 Jan 2015 12:47:29 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Proc-SQL-Duplicate-Findings-SOUNDEX/m-p/161007#M31329</guid>
      <dc:creator>sas_lak</dc:creator>
      <dc:date>2015-01-02T12:47:29Z</dc:date>
    </item>
    <item>
      <title>Re: Proc SQL - Duplicate Findings (SOUNDEX)</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Proc-SQL-Duplicate-Findings-SOUNDEX/m-p/161008#M31330</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;You seem to be running this on reasonable hardware, and "cpu time" being higher than "real time" also shows that the process runs multi-threaded. If 10 minutes is good enough for you (and for your environment), then maybe it's better to keep the simple code rather than tweak it for performance and end up with more complex code to maintain.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Sat, 03 Jan 2015 03:04:30 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Proc-SQL-Duplicate-Findings-SOUNDEX/m-p/161008#M31330</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2015-01-03T03:04:30Z</dc:date>
    </item>
  </channel>
</rss>

