<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: deleting mixed duplicates in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/deleting-mixed-duplicates/m-p/510573#M137410</link>
    <description>Sorry, buddy &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;BR /&gt;Same errors as before.&lt;BR /&gt;I guess SAS University Edition is quite restrictive and doesn't allow direct access to memory, so POKE won't work either. But I shall persevere. I'm much obliged and grateful for your time and engagement with my query.</description>
    <pubDate>Mon, 05 Nov 2018 19:39:16 GMT</pubDate>
    <dc:creator>jfaruqui</dc:creator>
    <dc:date>2018-11-05T19:39:16Z</dc:date>
    <item>
      <title>deleting mixed duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/deleting-mixed-duplicates/m-p/510557#M137398</link>
      <description>&lt;P&gt;I have a dataset with a number of variables, including NAME.&lt;/P&gt;&lt;P&gt;I am trying to delete duplicate observations of NAME where one observation is 'John Smith' and another is 'smith john'. They are clearly the same person, and I want to delete the duplicate entry. What would be the most efficient way to do it, considering that the duplicate names could occur anywhere within the dataset?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Ex:&lt;/P&gt;&lt;P&gt;John Smith&lt;/P&gt;&lt;P&gt;Cal Harper&lt;/P&gt;&lt;P&gt;freddy Holt&lt;/P&gt;&lt;P&gt;smith john&lt;/P&gt;&lt;P&gt;frank waters&lt;/P&gt;&lt;P&gt;harper Cal&lt;/P&gt;</description>
      <pubDate>Mon, 05 Nov 2018 19:13:34 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/deleting-mixed-duplicates/m-p/510557#M137398</guid>
      <dc:creator>jfaruqui</dc:creator>
      <dc:date>2018-11-05T19:13:34Z</dc:date>
    </item>
    <item>
      <title>deleting duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/deleting-mixed-duplicates/m-p/510542#M137419</link>
      <description>&lt;P&gt;I have a dataset with a number of variables, including NAME.&lt;/P&gt;&lt;P&gt;I am trying to delete duplicate observations of NAME where one observation is 'John Smith' and another is 'smith john'. They are clearly the same person, and I want to delete the duplicate entry. What would be the most efficient way to do it?&lt;/P&gt;</description>
      <pubDate>Mon, 05 Nov 2018 18:50:01 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/deleting-mixed-duplicates/m-p/510542#M137419</guid>
      <dc:creator>jfaruqui</dc:creator>
      <dc:date>2018-11-05T18:50:01Z</dc:date>
    </item>
    <item>
      <title>Re: deleting duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/deleting-mixed-duplicates/m-p/510547#M137420</link>
      <description>&lt;P&gt;How do you determine a duplicate then? What about "Johnn Smith"?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
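&lt;P&gt;To make the "Johnn Smith" question concrete, here is a minimal sketch of an edit-distance check using COMPLEV, which returns the Levenshtein edit distance between two strings. The two names and the threshold of 2 are illustrative assumptions, not a recommendation for your data.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Sketch only: the names and the threshold are assumptions */
data _null_;
length a b $50;
a=upcase('John Smith');
b=upcase('Johnn Smith');
/* Levenshtein edit distance; one inserted letter gives d=1 here */
d=complev(a,b);
if d le 2 then put 'Possible duplicate: ' a= b= d=;
run;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;On its own, a small distance threshold like this would not match 'smith john' with 'John Smith', so it would complement, rather than replace, a check that ignores word order.&lt;/P&gt;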
&lt;P&gt;Perhaps the &lt;A href="https://documentation.sas.com/?docsetId=lefunctionsref&amp;amp;docsetTarget=n0l41pdemybegln1oetsh4cctdap.htm&amp;amp;docsetVersion=9.4&amp;amp;locale=en" target="_self"&gt;COMPLEV&lt;/A&gt; or &lt;A href="https://documentation.sas.com/?docsetId=lefunctionsref&amp;amp;docsetTarget=p1r4l9jwgatggtn1ko81fyjys4s7.htm&amp;amp;docsetVersion=9.4&amp;amp;locale=en" target="_self"&gt;COMPGED&lt;/A&gt;&amp;nbsp;function can help. Both compute a 'distance' between two strings (Levenshtein edit distance and generalized edit distance, respectively).&lt;/P&gt;</description>
      <pubDate>Mon, 05 Nov 2018 18:56:29 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/deleting-mixed-duplicates/m-p/510547#M137420</guid>
      <dc:creator>PeterClemmensen</dc:creator>
      <dc:date>2018-11-05T18:56:29Z</dc:date>
    </item>
    <item>
      <title>Re: deleting duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/deleting-mixed-duplicates/m-p/510553#M137421</link>
      <description>&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
length name $50;
name='smith john';
output;
name='John smith';
output;
name='Mcdonald John';
output;
name='John Mcdonald';
output;
run;

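data t;
set have;
/* One 1-byte variable per character: t1-t50 */
array t(50) $1 ;
/* Copy the blank-stripped, uppercased name into memory starting at t1;
   this assumes t1-t50 are stored contiguously */
call pokelong(compress(upcase(name)),addrlong(t(1)),50);
/* Sort the letters so that word order no longer matters */
call sortc(of t(*));
/* Rebuild the sorted letters into a single key */
w=cats(of t(*));
drop t:;
run;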
proc sort data=t out=want(drop=w) nodupkey;
by w;
run; 
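
/* Optional check (an addition, not part of the original answer): print NAME
   next to the sorted-letter key W. Names that are anagrams of each other
   share a key and would be treated as duplicates. */
proc print data=t;
var name w;
run;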
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Mon, 05 Nov 2018 19:06:19 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/deleting-mixed-duplicates/m-p/510553#M137421</guid>
      <dc:creator>novinosrin</dc:creator>
      <dc:date>2018-11-05T19:06:19Z</dc:date>
    </item>
    <item>
      <title>Re: deleting duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/deleting-mixed-duplicates/m-p/510558#M137422</link>
      <description>&lt;P&gt;With temporary array,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data t;
set have;
array t(50) $1 _temporary_;
call missing(of t(*));
call pokelong(compress(upcase(name)),addrlong(t(1)),50);
call sortc(of t(*));
w=cats(of t(*));
run;
proc sort data=t out=want(drop=w) nodupkey;
by w;
run; 
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Mon, 05 Nov 2018 19:15:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/deleting-mixed-duplicates/m-p/510558#M137422</guid>
      <dc:creator>novinosrin</dc:creator>
      <dc:date>2018-11-05T19:15:09Z</dc:date>
    </item>
    <item>
      <title>Re: deleting duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/deleting-mixed-duplicates/m-p/510560#M137423</link>
      <description>Would this work even if the duplicates were far apart within the dataset, separated by many unique observations?&lt;BR /&gt;</description>
      <pubDate>Mon, 05 Nov 2018 19:19:00 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/deleting-mixed-duplicates/m-p/510560#M137423</guid>
      <dc:creator>jfaruqui</dc:creator>
      <dc:date>2018-11-05T19:19:00Z</dc:date>
    </item>
    <item>
      <title>Re: deleting mixed duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/deleting-mixed-duplicates/m-p/510561#M137401</link>
      <description>&lt;P&gt;I responded to the same question in your other thread:&lt;/P&gt;
&lt;P&gt;&lt;A href="https://communities.sas.com/t5/SAS-Programming/deleting-duplicates/m-p/510553/highlight/false#M137396" target="_blank"&gt;https://communities.sas.com/t5/SAS-Programming/deleting-duplicates/m-p/510553/highlight/false#M137396&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
input name $50.;
cards;
John Smith
Cal Harper
freddy Holt
smith john
frank waters
harper Cal
;
run;

data t;
set have;
array t(50) $1 _temporary_;
call missing(of t(*));
call pokelong(compress(upcase(name)),addrlong(t(1)),50);
call sortc(of t(*));
w=cats(of t(*));
run;
proc sort data=t out=want(drop=w) nodupkey;
by w;
run; &lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Mon, 05 Nov 2018 19:19:15 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/deleting-mixed-duplicates/m-p/510561#M137401</guid>
      <dc:creator>novinosrin</dc:creator>
      <dc:date>2018-11-05T19:19:15Z</dc:date>
    </item>
    <item>
      <title>Re: deleting duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/deleting-mixed-duplicates/m-p/510562#M137424</link>
      <description>&lt;P&gt;Rather than describing it in sentences, can you post a good representative sample of the data you have, please?&lt;/P&gt;</description>
      <pubDate>Mon, 05 Nov 2018 19:20:01 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/deleting-mixed-duplicates/m-p/510562#M137424</guid>
      <dc:creator>novinosrin</dc:creator>
      <dc:date>2018-11-05T19:20:01Z</dc:date>
    </item>
    <item>
      <title>Re: deleting mixed duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/deleting-mixed-duplicates/m-p/510563#M137403</link>
      <description>&lt;P&gt;I don't know how to merge the threads, although I could ask&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13879"&gt;@Reeza&lt;/a&gt;&amp;nbsp;/&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/11562"&gt;@Kurt_Bremser&lt;/a&gt;&amp;nbsp;to help merge the duplicate threads.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Going forward, please post follow-ups in the thread you already started.&lt;/P&gt;</description>
      <pubDate>Mon, 05 Nov 2018 19:23:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/deleting-mixed-duplicates/m-p/510563#M137403</guid>
      <dc:creator>novinosrin</dc:creator>
      <dc:date>2018-11-05T19:23:12Z</dc:date>
    </item>
    <item>
      <title>Re: deleting mixed duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/deleting-mixed-duplicates/m-p/510565#M137404</link>
      <description>Thanks, bro. Sorry for the duplicate threads. Unfortunately, I can't test your code on my dataset in SAS University Edition because I am getting these error messages:&lt;BR /&gt;&lt;BR /&gt;ERROR: The function POKELONG cannot be invoked when SAS is in the lockdown state.&lt;BR /&gt;ERROR: The function ADDRLONG cannot be invoked when SAS is in the lockdown state.&lt;BR /&gt;ERROR 251-185: The subroutine POKELONG is unknown, or cannot be accessed. Check your spelling.&lt;BR /&gt;Either it was not found in the path(s) of executable images, or there was incorrect or missing subroutine descriptor&lt;BR /&gt;information.&lt;BR /&gt;&lt;BR /&gt;ERROR 68-185: The function ADDRLONG is unknown, or cannot be accessed.&lt;BR /&gt;&lt;BR /&gt;But I am sure this solution would give the required result. Thanks again, bro!</description>
      <pubDate>Mon, 05 Nov 2018 19:26:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/deleting-mixed-duplicates/m-p/510565#M137404</guid>
      <dc:creator>jfaruqui</dc:creator>
      <dc:date>2018-11-05T19:26:09Z</dc:date>
    </item>
    <item>
      <title>Re: deleting mixed duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/deleting-mixed-duplicates/m-p/510567#M137405</link>
      <description>&lt;P&gt;Hang on. If you are new or relatively new to SAS, let alone to the APP (ADDR, PEEK, POKE) functions, I beg your pardon: please ignore the APP data management functions.&lt;/P&gt;</description>
      <pubDate>Mon, 05 Nov 2018 19:28:15 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/deleting-mixed-duplicates/m-p/510567#M137405</guid>
      <dc:creator>novinosrin</dc:creator>
      <dc:date>2018-11-05T19:28:15Z</dc:date>
    </item>
    <item>
      <title>Re: deleting mixed duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/deleting-mixed-duplicates/m-p/510569#M137406</link>
      <description>So new that this is the first time I have heard of the APP functions. Fascinating stuff, though; I'm reading about them just now.</description>
      <pubDate>Mon, 05 Nov 2018 19:32:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/deleting-mixed-duplicates/m-p/510569#M137406</guid>
      <dc:creator>jfaruqui</dc:creator>
      <dc:date>2018-11-05T19:32:46Z</dc:date>
    </item>
    <item>
      <title>Re: deleting mixed duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/deleting-mixed-duplicates/m-p/510570#M137407</link>
      <description>&lt;P&gt;OK, just try the 32-bit version:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data t;
set have;
array t(50) $1 _temporary_;
call missing(of t(*));
call poke(compress(upcase(name)),addr(t(1)),50);
call sortc(of t(*));
w=cats(of t(*));
run;
proc sort data=t out=want(drop=w) nodupkey;
by w;
run; &lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;Test this and see if it works.&lt;/P&gt;</description>
      <pubDate>Mon, 05 Nov 2018 19:37:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/deleting-mixed-duplicates/m-p/510570#M137407</guid>
      <dc:creator>novinosrin</dc:creator>
      <dc:date>2018-11-05T19:37:39Z</dc:date>
    </item>
    <item>
      <title>Re: deleting mixed duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/deleting-mixed-duplicates/m-p/510573#M137410</link>
      <description>Sorry, buddy &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;BR /&gt;Same errors as before.&lt;BR /&gt;I guess SAS University Edition is quite restrictive and doesn't allow direct access to memory, so POKE won't work either. But I shall persevere. I'm much obliged and grateful for your time and engagement with my query.</description>
      <pubDate>Mon, 05 Nov 2018 19:39:16 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/deleting-mixed-duplicates/m-p/510573#M137410</guid>
      <dc:creator>jfaruqui</dc:creator>
      <dc:date>2018-11-05T19:39:16Z</dc:date>
    </item>
    <item>
      <title>Re: deleting mixed duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/deleting-mixed-duplicates/m-p/510574#M137411</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/98537"&gt;@jfaruqui&lt;/a&gt;&amp;nbsp; OK, let's go linear:&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;
data have;
input name $50.;
cards;
John Smith
Cal Harper
freddy Holt
smith john
frank waters
harper Cal
;
run;


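data t;
set have;
array t(50) $1 _temporary_;
/* Temporary arrays are retained, so clear the letters from the previous row */
call missing(of t(*));
/* Uppercase the name and remove blanks */
n=compress(upcase(name));
/* Copy the name into the array one character at a time */
do _n_=1 to length(n);
t(_n_)=char(n,_n_);
end;
/* Sort the letters and rebuild them into a key */
call sortc(of t(*));
w=cats(of t(*));
run;
/* Keep one observation per key */
proc sort data=t out=want(drop=w n) nodupkey;
by w;
run; &lt;/CODE&gt;&lt;/PRE&gt;</description>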
      <pubDate>Mon, 05 Nov 2018 19:44:03 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/deleting-mixed-duplicates/m-p/510574#M137411</guid>
      <dc:creator>novinosrin</dc:creator>
      <dc:date>2018-11-05T19:44:03Z</dc:date>
    </item>
    <item>
      <title>Re: deleting mixed duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/deleting-mixed-duplicates/m-p/510577#M137412</link>
      <description>BINGO !!&lt;BR /&gt;Awesome and thank you so very much !!</description>
      <pubDate>Mon, 05 Nov 2018 19:47:07 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/deleting-mixed-duplicates/m-p/510577#M137412</guid>
      <dc:creator>jfaruqui</dc:creator>
      <dc:date>2018-11-05T19:47:07Z</dc:date>
    </item>
    <item>
      <title>Re: deleting duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/deleting-mixed-duplicates/m-p/510605#M137425</link>
      <description>Since this was answered in your other thread, I'm going to merge them.</description>
      <pubDate>Mon, 05 Nov 2018 21:22:24 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/deleting-mixed-duplicates/m-p/510605#M137425</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2018-11-05T21:22:24Z</dc:date>
    </item>
    <item>
      <title>Re: deleting mixed duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/deleting-mixed-duplicates/m-p/510648#M137433</link>
      <description>&lt;P&gt;Minor variant of &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/138205"&gt;@novinosrin&lt;/a&gt;'s&amp;nbsp;solution (sorting the &lt;EM&gt;names&lt;/EM&gt; rather than the &lt;EM&gt;letters&lt;/EM&gt; contained in the names) and more test data:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
input name $50.;
cards;
Sylvie Elpers
Lissy Pleever
Elvis Presley
Lily Spreeves
Presley Elvis
;

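data t;
set have;
/* Holds up to 5 words of at most 25 characters each */
array t(5) $25 _temporary_;
length w $50;
call missing(of t(*));
/* Uppercase the name and collapse repeated blanks */
n=compbl(upcase(name));
/* Split the name into words */
do _n_=1 to countw(n,' ');
  t(_n_)=scan(n,_n_,' ');
end;
/* Sort the words, not the letters, and rebuild the key */
call sortc(of t(*));
w=catx(' ', of t(*));
run;

/* Keep one observation per key */
proc sort data=t out=want(drop=w n) nodupkey;
by w;
run; &lt;/CODE&gt;&lt;/PRE&gt;</description>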
      <pubDate>Mon, 05 Nov 2018 23:53:57 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/deleting-mixed-duplicates/m-p/510648#M137433</guid>
      <dc:creator>FreelanceReinh</dc:creator>
      <dc:date>2018-11-05T23:53:57Z</dc:date>
    </item>
    <item>
      <title>Re: deleting mixed duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/deleting-mixed-duplicates/m-p/510707#M137448</link>
      <description>As a matter of fact, this solution is more relevant because, as you rightly demonstrated above, different names can contain exactly the same letters. If we sort by letters, your entire test dataset is reduced to a single name! So yes, this solution is more robust.</description>
      <pubDate>Tue, 06 Nov 2018 09:04:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/deleting-mixed-duplicates/m-p/510707#M137448</guid>
      <dc:creator>jfaruqui</dc:creator>
      <dc:date>2018-11-06T09:04:59Z</dc:date>
    </item>
    <item>
      <title>Re: deleting mixed duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/deleting-mixed-duplicates/m-p/515596#M139142</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I am pretty new to SAS. I found the character-sorting solution good for most cases, but it&amp;nbsp;will fail when two different names contain the same set of characters, as in the example below.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Example:&lt;/STRONG&gt;&lt;BR /&gt;Raam Rahim&lt;BR /&gt;Ram Rahima&lt;BR /&gt;Rahim Raam&lt;BR /&gt;Rama Rahim&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The output should be 3 names, but the code generates just one. So I don't think we can use character sorting for the comparison; we probably need to compare word by word.&lt;/P&gt;</description>
      <pubDate>Fri, 23 Nov 2018 16:28:11 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/deleting-mixed-duplicates/m-p/515596#M139142</guid>
      <dc:creator>goravnet</dc:creator>
      <dc:date>2018-11-23T16:28:11Z</dc:date>
    </item>
  </channel>
</rss>

