<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic End of file &amp; hash / finding duplicate records in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/End-of-file-hash-finding-duplicate-records/m-p/183926#M303443</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;&lt;BR /&gt;Hello!&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I would like to filter the duplicate records of data set A; that is, I would like to kick out the observations with Nr &amp;gt;= 2. However, If (EoF) &amp;amp; (Nr ge 2) Then H.Output(Dataset:"Duplicates"); doesn't work.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The following program works, but it also keeps the IDs that occur only once:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Data A;&lt;BR /&gt;&amp;nbsp; Do i=1 To 30;&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; ID=Byte(Int(RanUni(1)*26)+65);&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; Output;&lt;BR /&gt;&amp;nbsp; End;&lt;BR /&gt;Run;&lt;/P&gt;&lt;P&gt;Data _NULL_;&lt;BR /&gt;&amp;nbsp; Length Nr 3.;&lt;BR /&gt;&amp;nbsp; If _N_ eq 1 Then Do;&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; Declare Hash H();&lt;BR /&gt; H.DefineKey("ID");&lt;BR /&gt; H.DefineData("ID", "i", "Nr");&lt;BR /&gt; H.DefineDone();&lt;BR /&gt;&amp;nbsp; End;&lt;BR /&gt;&amp;nbsp; Set A End=EoF;&lt;BR /&gt;&amp;nbsp; If H.Find() ne 0 Then Do;&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; Nr=1;&lt;BR /&gt; H.Add();&lt;BR /&gt;&amp;nbsp; End;&lt;BR /&gt;&amp;nbsp; Else Do;&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; Nr+1;&lt;BR /&gt; H.Replace();&lt;BR /&gt;&amp;nbsp; End;&lt;BR /&gt;&amp;nbsp; If EoF Then H.Output(Dataset:"Duplicates");&lt;BR /&gt;Run;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;My second question: how can I find the duplicate records (not count them) of data set "A" using a hash object?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks &amp;amp; kind regards&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Mon, 24 Nov 2014 16:21:11 GMT</pubDate>
    <dc:creator>user24feb</dc:creator>
    <dc:date>2014-11-24T16:21:11Z</dc:date>
    <item>
      <title>End of file &amp; hash / finding duplicate records</title>
      <link>https://communities.sas.com/t5/SAS-Programming/End-of-file-hash-finding-duplicate-records/m-p/183926#M303443</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;&lt;BR /&gt;Hello!&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I would like to filter the duplicate records of data set A; that is, I would like to kick out the observations with Nr &amp;gt;= 2. However, If (EoF) &amp;amp; (Nr ge 2) Then H.Output(Dataset:"Duplicates"); doesn't work.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The following program works, but it also keeps the IDs that occur only once:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Data A;&lt;BR /&gt;&amp;nbsp; Do i=1 To 30;&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; ID=Byte(Int(RanUni(1)*26)+65);&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; Output;&lt;BR /&gt;&amp;nbsp; End;&lt;BR /&gt;Run;&lt;/P&gt;&lt;P&gt;Data _NULL_;&lt;BR /&gt;&amp;nbsp; Length Nr 3.;&lt;BR /&gt;&amp;nbsp; If _N_ eq 1 Then Do;&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; Declare Hash H();&lt;BR /&gt; H.DefineKey("ID");&lt;BR /&gt; H.DefineData("ID", "i", "Nr");&lt;BR /&gt; H.DefineDone();&lt;BR /&gt;&amp;nbsp; End;&lt;BR /&gt;&amp;nbsp; Set A End=EoF;&lt;BR /&gt;&amp;nbsp; If H.Find() ne 0 Then Do;&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; Nr=1;&lt;BR /&gt; H.Add();&lt;BR /&gt;&amp;nbsp; End;&lt;BR /&gt;&amp;nbsp; Else Do;&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; Nr+1;&lt;BR /&gt; H.Replace();&lt;BR /&gt;&amp;nbsp; End;&lt;BR /&gt;&amp;nbsp; If EoF Then H.Output(Dataset:"Duplicates");&lt;BR /&gt;Run;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;My second question: how can I find the duplicate records (not count them) of data set "A" using a hash object?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks &amp;amp; kind regards&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 24 Nov 2014 16:21:11 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/End-of-file-hash-finding-duplicate-records/m-p/183926#M303443</guid>
      <dc:creator>user24feb</dc:creator>
      <dc:date>2014-11-24T16:21:11Z</dc:date>
    </item>
    <item>
      <title>Re: End of file &amp; hash / finding duplicate records</title>
      <link>https://communities.sas.com/t5/SAS-Programming/End-of-file-hash-finding-duplicate-records/m-p/183927#M303444</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;If I understood you correctly, this is what you mean:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;PRE&gt;


 
Data A;
&amp;nbsp; Do i=1 To 30;
&amp;nbsp;&amp;nbsp;&amp;nbsp; ID=Byte(Int(RanUni(1)*26)+65);
&amp;nbsp;&amp;nbsp;&amp;nbsp; Output;
&amp;nbsp; End;
Run;
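data _null_;
  if _n_ eq 1 then do;
    if 0 then set a;               /* define the variables of A in the PDV without reading data */
    declare hash h();
    h.definekey('id');
    h.definedata('id','n');
    h.definedone();
  end;
  set a end=last;
  /* n counts how often each ID occurs */
  if h.find()=0 then do;           /* ID already stored: increment its count */
    n+1;
    h.replace();
  end;
  else do;                         /* first occurrence of this ID */
    n=1;
    h.replace();
  end;
  if last then do;
    /* split the hash contents by count with WHERE= data set options */
    h.output(dataset:'singular(where=(n=1))');
    h.output(dataset:'duplicate(where=(n gt 1))');
  end;
run;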
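
/* For the second question (finding duplicate records without counting  */
/* them), a minimal sketch: it assumes a "duplicate record" means any   */
/* observation of A whose ID has already occurred earlier in A. The     */
/* names duplicate_obs and seen are made up for this illustration.      */
data duplicate_obs;
  if _n_ eq 1 then do;
    declare hash seen();
    seen.definekey('id');
    seen.definedone();
  end;
  set a;
  /* check() returns 0 when the key is already in the hash */
  if seen.check()=0 then output;   /* repeated ID: keep this observation */
  else seen.add();                 /* first occurrence: remember the ID  */
run;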

&lt;/PRE&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Xia Keshan&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 25 Nov 2014 13:34:37 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/End-of-file-hash-finding-duplicate-records/m-p/183927#M303444</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2014-11-25T13:34:37Z</dc:date>
    </item>
    <item>
      <title>Re: End of file &amp; hash / finding duplicate records</title>
      <link>https://communities.sas.com/t5/SAS-Programming/End-of-file-hash-finding-duplicate-records/m-p/183928#M303445</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Yes, that's exactly what I meant. I didn't think of putting a WHERE= data set option after the data set name. Many thanks!&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 25 Nov 2014 13:52:26 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/End-of-file-hash-finding-duplicate-records/m-p/183928#M303445</guid>
      <dc:creator>user24feb</dc:creator>
      <dc:date>2014-11-25T13:52:26Z</dc:date>
    </item>
  </channel>
</rss>

