<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Delete duplicate obs with HASH based on three or more variables and on some condition in SAS Procedures</title>
    <link>https://communities.sas.com/t5/SAS-Procedures/Delete-duplicate-obs-with-HASH-based-on-three-or-more-variables/m-p/234799#M54975</link>
    <description>OK, you can try this. First, remove the duplicates by ID and drop IND1, QUANT1 and QUANT2.&lt;BR /&gt;Then create a second data set that removes all the D records and keeps the C records, and merge the two data sets.&lt;BR /&gt;&lt;BR /&gt;Regards,&lt;BR /&gt;KD</description>
    <pubDate>Sun, 15 Nov 2015 10:34:59 GMT</pubDate>
    <dc:creator>pearsoninst</dc:creator>
    <dc:date>2015-11-15T10:34:59Z</dc:date>
    <item>
      <title>Delete duplicate obs with HASH based on three or more variables and on some condition</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Delete-duplicate-obs-with-HASH-based-on-three-or-more-variables/m-p/234793#M54973</link>
      <description>&lt;P&gt;Dear all,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I hope you are well.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I have a really huge dataset which is NOT sorted (sorting would take ages, if it did not fail outright) and I need to find and remove duplicates based on three &lt;U&gt;or even more&lt;/U&gt; variables.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;TABLE&gt;
&lt;TR&gt;&lt;TH&gt;ID&lt;/TH&gt;&lt;TH&gt;Var1&lt;/TH&gt;&lt;TH&gt;Var2&lt;/TH&gt;&lt;TH&gt;IND1&lt;/TH&gt;&lt;TH&gt;QUANT1&lt;/TH&gt;&lt;TH&gt;QUANT2&lt;/TH&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;100&lt;/TD&gt;&lt;TD&gt;20&lt;/TD&gt;&lt;TD&gt;15&lt;/TD&gt;&lt;TD&gt;D&lt;/TD&gt;&lt;TD&gt;234&lt;/TD&gt;&lt;TD&gt;5678&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;200&lt;/TD&gt;&lt;TD&gt;14&lt;/TD&gt;&lt;TD&gt;12&lt;/TD&gt;&lt;TD&gt;&lt;STRONG&gt;C&lt;/STRONG&gt;&lt;/TD&gt;&lt;TD&gt;689&lt;/TD&gt;&lt;TD&gt;1567&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;100&lt;/TD&gt;&lt;TD&gt;20&lt;/TD&gt;&lt;TD&gt;15&lt;/TD&gt;&lt;TD&gt;&lt;STRONG&gt;C&lt;/STRONG&gt;&lt;/TD&gt;&lt;TD&gt;567&lt;/TD&gt;&lt;TD&gt;489&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;300&lt;/TD&gt;&lt;TD&gt;12&lt;/TD&gt;&lt;TD&gt;11&lt;/TD&gt;&lt;TD&gt;M&lt;/TD&gt;&lt;TD&gt;7865&lt;/TD&gt;&lt;TD&gt;9890&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;200&lt;/TD&gt;&lt;TD&gt;14&lt;/TD&gt;&lt;TD&gt;12&lt;/TD&gt;&lt;TD&gt;D&lt;/TD&gt;&lt;TD&gt;6476&lt;/TD&gt;&lt;TD&gt;5763&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;200&lt;/TD&gt;&lt;TD&gt;55&lt;/TD&gt;&lt;TD&gt;10&lt;/TD&gt;&lt;TD&gt;M&lt;/TD&gt;&lt;TD&gt;545&lt;/TD&gt;&lt;TD&gt;3434&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;200&lt;/TD&gt;&lt;TD&gt;14&lt;/TD&gt;&lt;TD&gt;12&lt;/TD&gt;&lt;TD&gt;S&lt;/TD&gt;&lt;TD&gt;1687&lt;/TD&gt;&lt;TD&gt;3323&lt;/TD&gt;&lt;/TR&gt;
&lt;/TABLE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;1.--&amp;nbsp;In effect I need a "NODUPKEY" behaviour, i.e. &lt;U&gt;keep one of the duplicate records&lt;/U&gt; based on the three or more variables.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;2.-- Just before doing that I need to apply a condition: if one of the duplicate records has IND1 = C and there is another duplicate record with IND1 = D, then replace the QUANT1 and QUANT2 values of the record with IND1 = D with the corresponding values from the record with IND1 = C, then delete the record with IND1 = C and any other duplicate records based on the same combination of the three variables, so that only the updated IND1 = D record remains.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;My WANT data file would look like this&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;TABLE&gt;
&lt;TR&gt;&lt;TH&gt;ID&lt;/TH&gt;&lt;TH&gt;Var1&lt;/TH&gt;&lt;TH&gt;Var2&lt;/TH&gt;&lt;TH&gt;IND1&lt;/TH&gt;&lt;TH&gt;QUANT1&lt;/TH&gt;&lt;TH&gt;QUANT2&lt;/TH&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;100&lt;/TD&gt;&lt;TD&gt;20&lt;/TD&gt;&lt;TD&gt;15&lt;/TD&gt;&lt;TD&gt;D&lt;/TD&gt;&lt;TD&gt;&lt;STRIKE&gt;234&lt;/STRIKE&gt; &lt;STRONG&gt;567&lt;/STRONG&gt;&lt;/TD&gt;&lt;TD&gt;&lt;STRIKE&gt;5678&lt;/STRIKE&gt; &lt;STRONG&gt;489&lt;/STRONG&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;300&lt;/TD&gt;&lt;TD&gt;12&lt;/TD&gt;&lt;TD&gt;11&lt;/TD&gt;&lt;TD&gt;M&lt;/TD&gt;&lt;TD&gt;7865&lt;/TD&gt;&lt;TD&gt;9890&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;200&lt;/TD&gt;&lt;TD&gt;14&lt;/TD&gt;&lt;TD&gt;12&lt;/TD&gt;&lt;TD&gt;D&lt;/TD&gt;&lt;TD&gt;&lt;STRIKE&gt;6476&lt;/STRIKE&gt; &lt;STRONG&gt;689&lt;/STRONG&gt;&lt;/TD&gt;&lt;TD&gt;&lt;STRIKE&gt;5763&lt;/STRIKE&gt; &lt;STRONG&gt;1567&lt;/STRONG&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;200&lt;/TD&gt;&lt;TD&gt;55&lt;/TD&gt;&lt;TD&gt;10&lt;/TD&gt;&lt;TD&gt;M&lt;/TD&gt;&lt;TD&gt;545&lt;/TD&gt;&lt;TD&gt;3434&lt;/TD&gt;&lt;/TR&gt;
&lt;/TABLE&gt;
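&lt;P&gt;A minimal sketch of the logic described above, written in Python rather than SAS purely as an illustration: a dictionary plays the role a SAS hash object would play, so no sort is needed. The function name dedupe, the choice to keep the first record met when a key has no C/D pair, and the assumption that one entry per distinct key fits in memory are all assumptions, not part of the original request.&lt;/P&gt;
&lt;PRE&gt;
```python
def dedupe(records, key_fields=("ID", "Var1", "Var2")):
    """Keep one record per key. Where a key has both a C and a D record,
    keep the D record carrying the C record's QUANT1 and QUANT2."""
    first_seen = {}  # key -> first record met for that key
    c_rec = {}       # key -> first record with IND1 == "C"
    d_rec = {}       # key -> first record with IND1 == "D"

    # Single pass over the unsorted input, one dictionary lookup per record.
    for rec in records:
        key = tuple(rec[f] for f in key_fields)
        first_seen.setdefault(key, rec)
        if rec["IND1"] == "C":
            c_rec.setdefault(key, rec)
        elif rec["IND1"] == "D":
            d_rec.setdefault(key, rec)

    result = []
    for key, rec in first_seen.items():
        if key in c_rec and key in d_rec:
            # Copy the D record and overwrite its quantities with C's.
            rec = dict(d_rec[key],
                       QUANT1=c_rec[key]["QUANT1"],
                       QUANT2=c_rec[key]["QUANT2"])
        result.append(rec)
    return result


# Sample rows from the post.
columns = ("ID", "Var1", "Var2", "IND1", "QUANT1", "QUANT2")
rows = [
    (100, 20, 15, "D", 234, 5678),
    (200, 14, 12, "C", 689, 1567),
    (100, 20, 15, "C", 567, 489),
    (300, 12, 11, "M", 7865, 9890),
    (200, 14, 12, "D", 6476, 5763),
    (200, 55, 10, "M", 545, 3434),
    (200, 14, 12, "S", 1687, 3323),
]
have = [dict(zip(columns, row)) for row in rows]
want = dedupe(have)
for rec in want:
    print(rec)
```
&lt;/PRE&gt;
&lt;P&gt;On the sample rows this yields four records: the 100/20/15 record keeps IND1 = D with 567 and 489, and the 200/14/12 record keeps IND1 = D with 689 and 1567. Records come out in the order each key first appears, which differs from the row order in the WANT listing.&lt;/P&gt;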
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Thank you in advance&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Best regards&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Nik&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 15 Nov 2015 05:53:26 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Delete-duplicate-obs-with-HASH-based-on-three-or-more-variables/m-p/234793#M54973</guid>
      <dc:creator>Nikos</dc:creator>
      <dc:date>2015-11-15T05:53:26Z</dc:date>
    </item>
    <item>
      <title>Re: Delete duplicate obs with HASH based on three or more variables and on some condition</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Delete-duplicate-obs-with-HASH-based-on-three-or-more-variables/m-p/234794#M54974</link>
      <description>&lt;P&gt;Most importantly, I would like to use a &lt;STRONG&gt;HASH&lt;/STRONG&gt; object, since my dataset is huge (i.e. millions of records).&lt;/P&gt;</description>
      <pubDate>Sun, 15 Nov 2015 05:55:33 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Delete-duplicate-obs-with-HASH-based-on-three-or-more-variables/m-p/234794#M54974</guid>
      <dc:creator>Nikos</dc:creator>
      <dc:date>2015-11-15T05:55:33Z</dc:date>
    </item>
    <item>
      <title>Re: Delete duplicate obs with HASH based on three or more variables and on some condition</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Delete-duplicate-obs-with-HASH-based-on-three-or-more-variables/m-p/234799#M54975</link>
      <description>OK, you can try this. First, remove the duplicates by ID and drop IND1, QUANT1 and QUANT2.&lt;BR /&gt;Then create a second data set that removes all the D records and keeps the C records, and merge the two data sets.&lt;BR /&gt;&lt;BR /&gt;Regards,&lt;BR /&gt;KD</description>
      <pubDate>Sun, 15 Nov 2015 10:34:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Delete-duplicate-obs-with-HASH-based-on-three-or-more-variables/m-p/234799#M54975</guid>
      <dc:creator>pearsoninst</dc:creator>
      <dc:date>2015-11-15T10:34:59Z</dc:date>
    </item>
    <item>
      <title>Re: Delete duplicate obs with HASH based on three or more variables and on some condition</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Delete-duplicate-obs-with-HASH-based-on-three-or-more-variables/m-p/235069#M55006</link>
      <description>&lt;P&gt;Thank you, but the datasets are really large and sorting them would require a lot of resources.&lt;/P&gt;
&lt;P&gt;I am looking for a HASH-based alternative because of its speed.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;BR&lt;/P&gt;</description>
      <pubDate>Tue, 17 Nov 2015 17:06:36 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Delete-duplicate-obs-with-HASH-based-on-three-or-more-variables/m-p/235069#M55006</guid>
      <dc:creator>Nikos</dc:creator>
      <dc:date>2015-11-17T17:06:36Z</dc:date>
    </item>
  </channel>
</rss>

