<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Fastest way to delete rows from a dataset by key in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Fastest-way-to-delete-rows-from-a-dataset-by-key/m-p/279872#M269694</link>
    <description>&lt;P&gt;&amp;nbsp;@Ksharp Unless the 1) data is partitioned in different I/O sub-systems and 2) DS2 uses threads, DS2 will not be faster, will it?&lt;/P&gt;</description>
    <pubDate>Thu, 23 Jun 2016 23:13:51 GMT</pubDate>
    <dc:creator>ChrisNZ</dc:creator>
    <dc:date>2016-06-23T23:13:51Z</dc:date>
    <item>
      <title>Fastest way to delete rows from a dataset by key</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Fastest-way-to-delete-rows-from-a-dataset-by-key/m-p/277936#M269669</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I have a huge dataset (let's name it HUGE). HUGE has:&lt;/P&gt;
&lt;P&gt;-&amp;nbsp;500 million rows&lt;/P&gt;
&lt;P&gt;- a composite unique primary key, let's name it (K1,K2) (two columns&amp;nbsp;belong to the key)&lt;/P&gt;
&lt;P&gt;- some other columns in addition to the key&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Next&amp;nbsp;I have a small dataset (let's name it SMALL). SMALL has:&lt;/P&gt;
&lt;P&gt;- 200k rows&lt;/P&gt;
&lt;P&gt;- exactly the same&amp;nbsp;composite unique primary key (K1,K2)&lt;/P&gt;
&lt;P&gt;-&amp;nbsp;no more columns in addition to the key.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I.e.:&lt;/P&gt;
&lt;P&gt;- HUGE: K1 | K2 | FIELD1 | ANOTHERFIELD2 | SOMEOTHERFIELD3 | ETC, UPK(K1,K2)&lt;/P&gt;
&lt;P&gt;- SMALL: K1 | K2, UPK(K1,K2)&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I need to delete from HUGE all the rows whose&amp;nbsp;key appears in SMALL, so that after this process HUGE and SMALL&amp;nbsp;do not share any common key.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I first tried PROC SQL with DELETE FROM, but I couldn't work out the right syntax. On a DBMS I would have written:&lt;/P&gt;
&lt;P&gt;DELETE FROM HUGE WHERE (K1,K2) IN (SELECT K1,K2 FROM SMALL)&lt;/P&gt;
&lt;P&gt;but SAS does not allow this syntax.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Next I tried with this:&lt;/P&gt;
&lt;P&gt;data HUGE;&lt;BR /&gt; merge HUGE(in=in1) SMALL(in=in2); by K1 K2;&lt;BR /&gt; if (in1 and in2) then delete;&lt;BR /&gt; run;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;This works, but it takes almost a day to complete because of the size of HUGE.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;What would you recommend?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks a lot&lt;/P&gt;
&lt;P&gt;Regards&lt;/P&gt;</description>
      <pubDate>Thu, 16 Jun 2016 15:33:03 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Fastest-way-to-delete-rows-from-a-dataset-by-key/m-p/277936#M269669</guid>
      <dc:creator>Edoedoedo</dc:creator>
      <dc:date>2016-06-16T15:33:03Z</dc:date>
    </item>
    <item>
      <title>Re: Fastest way to delete rows from a dataset by key</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Fastest-way-to-delete-rows-from-a-dataset-by-key/m-p/277951#M269670</link>
      <description>&lt;P&gt;Try a hash solution. SMALL is loaded into a hash table, and HUGE is then read one record at a time. Matching records are dropped.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;data want;&lt;BR /&gt;if _n_ = 1 then do;&lt;BR /&gt;if 0 then set SMALL;&lt;BR /&gt;declare hash h(dataset:'SMALL');&lt;BR /&gt;h.definekey('K1', 'K2');&lt;BR /&gt;h.definedone();&lt;BR /&gt;end;&lt;BR /&gt;set HUGE;&lt;BR /&gt;if h.find() ^= 0;&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 16 Jun 2016 16:08:43 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Fastest-way-to-delete-rows-from-a-dataset-by-key/m-p/277951#M269670</guid>
      <dc:creator>KachiM</dc:creator>
      <dc:date>2016-06-16T16:08:43Z</dc:date>
    </item>
    <item>
      <title>Re: Fastest way to delete rows from a dataset by key</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Fastest-way-to-delete-rows-from-a-dataset-by-key/m-p/277974#M269671</link>
      <description>&lt;P&gt;It's hard to know beforehand what will be optimized. Try:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
DELETE FROM HUGE as H
WHERE exists (select K1, K2 from SMALL where K1=H.K1 and K2=H.K2);
quit;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 16 Jun 2016 17:23:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Fastest-way-to-delete-rows-from-a-dataset-by-key/m-p/277974#M269671</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2016-06-16T17:23:39Z</dc:date>
    </item>
    <item>
      <title>Re: Fastest way to delete rows from a dataset by key</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Fastest-way-to-delete-rows-from-a-dataset-by-key/m-p/278231#M269672</link>
      <description>Hey &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/17813"&gt;@KachiM&lt;/a&gt;, just because hash methods can be quick in some situations due to in-memory techniques, it doesn't mean they are suitable for every problem.&lt;BR /&gt;The trick is not to access the whole table; that's why we invented indexes. &lt;BR /&gt;I blame &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/18408"&gt;@Ksharp&lt;/a&gt; for this overuse of hashing &lt;span class="lia-unicode-emoji" title=":grinning_squinting_face:"&gt;😆&lt;/span&gt;</description>
      <pubDate>Fri, 17 Jun 2016 15:50:43 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Fastest-way-to-delete-rows-from-a-dataset-by-key/m-p/278231#M269672</guid>
      <dc:creator>LinusH</dc:creator>
      <dc:date>2016-06-17T15:50:43Z</dc:date>
    </item>
    <item>
      <title>Re: Fastest way to delete rows from a dataset by key</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Fastest-way-to-delete-rows-from-a-dataset-by-key/m-p/278237#M269673</link>
      <description>Usually in a data warehouse you don't delete any records; you just mark certain records as no longer valid, perhaps using end dates.&lt;BR /&gt;&lt;BR /&gt;That said, as an alternative to SQL DELETE FROM (if you can't get it to run fast enough), you should be able to use MODIFY with REMOVE in a data step. I have never used it, but it should be quite fast since it will use your index.</description>
      <pubDate>Fri, 17 Jun 2016 16:13:49 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Fastest-way-to-delete-rows-from-a-dataset-by-key/m-p/278237#M269673</guid>
      <dc:creator>LinusH</dc:creator>
      <dc:date>2016-06-17T16:13:49Z</dc:date>
    </item>
    <item>
      <title>Re: Fastest way to delete rows from a dataset by key</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Fastest-way-to-delete-rows-from-a-dataset-by-key/m-p/278248#M269674</link>
      <description>&lt;P&gt;Hi LinusH:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I do not understand your comment.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;To me, choosing a hash, an array or any other SAS way to solve a problem hinges on minimizing run-time, memory, or both. Since the data types of K1 and K2 are not known, I used a hash object. If they were numeric, I would have solved the problem with an array.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I would appreciate it if the superiority of a method were empirically verified before comments are offered.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Regards&lt;/P&gt;</description>
      <pubDate>Fri, 17 Jun 2016 17:17:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Fastest-way-to-delete-rows-from-a-dataset-by-key/m-p/278248#M269674</guid>
      <dc:creator>KachiM</dc:creator>
      <dc:date>2016-06-17T17:17:58Z</dc:date>
    </item>
    <item>
      <title>Re: Fastest way to delete rows from a dataset by key</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Fastest-way-to-delete-rows-from-a-dataset-by-key/m-p/278256#M269675</link>
      <description>&lt;P&gt;It's not always about minimizing&amp;nbsp;"&lt;SPAN&gt;run-time or memory or both" but often about minimizing data transfer times. A hash solution will always require the transfer of whole table(s) into memory, and that can take a long time. That is what&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13674"&gt;@LinusH&lt;/a&gt;&amp;nbsp;was talking about. The optimal procedure will be the best compromise within the constraints set by CPU, memory and network resources.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 17 Jun 2016 17:59:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Fastest-way-to-delete-rows-from-a-dataset-by-key/m-p/278256#M269675</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2016-06-17T17:59:13Z</dc:date>
    </item>
    <item>
      <title>Re: Fastest way to delete rows from a dataset by key</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Fastest-way-to-delete-rows-from-a-dataset-by-key/m-p/278262#M269676</link>
      <description>&lt;P&gt;I should have mentioned I/O time in addition to run-time.&lt;/P&gt;</description>
      <pubDate>Fri, 17 Jun 2016 18:09:23 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Fastest-way-to-delete-rows-from-a-dataset-by-key/m-p/278262#M269676</guid>
      <dc:creator>KachiM</dc:creator>
      <dc:date>2016-06-17T18:09:23Z</dc:date>
    </item>
    <item>
      <title>Re: Fastest way to delete rows from a dataset by key</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Fastest-way-to-delete-rows-from-a-dataset-by-key/m-p/278340#M269677</link>
      <description>&lt;PRE&gt;
@LinusH
I enjoy using hash tables. They seem to be my last resort when I face a tough question and don't know what to do.
I think indexes should be retired.   *^_^*
&lt;/PRE&gt;</description>
      <pubDate>Sat, 18 Jun 2016 06:53:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Fastest-way-to-delete-rows-from-a-dataset-by-key/m-p/278340#M269677</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2016-06-18T06:53:39Z</dc:date>
    </item>
    <item>
      <title>Re: Fastest way to delete rows from a dataset by key</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Fastest-way-to-delete-rows-from-a-dataset-by-key/m-p/278555#M269678</link>
      <description>&lt;P&gt;As mentioned (sorry if I repeat here&amp;nbsp;some of the valid points made before), the key for speed for this type of operation is not to read the whole table and write it out again.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;So no&lt;/P&gt;
&lt;P&gt;proc sql ; create ...&lt;/P&gt;
&lt;P&gt;or&lt;/P&gt;
&lt;P&gt;data HUGE; set/merge HUGE;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Instead modifying the table in place is the way to go.&lt;/P&gt;
&lt;P&gt;The data step's &lt;STRONG&gt;modify&lt;/STRONG&gt; statement and proc sql's &lt;STRONG&gt;delete&lt;/STRONG&gt; statement are prime candidates for this. They allow "soft deletion", where the observation is still there (hence no rewriting of the whole&amp;nbsp;data set) but&amp;nbsp;marked as&amp;nbsp;deleted. proc contents shows the deleted observations.&lt;/P&gt;
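&lt;P&gt;As a minimal sketch of how to count the soft-deleted observations without reading the full proc contents output, the ATTRN function can be queried for the physical, logical and deleted counts. This assumes HUGE lives in the WORK library; the macro variable names are only illustrative.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;%* Open the data set and read its observation counters;
%let dsid  = %sysfunc(open(WORK.HUGE));
%let nobs  = %sysfunc(attrn(&amp;amp;dsid, NOBS));   %* physical observations, deleted ones included;
%let nlobs = %sysfunc(attrn(&amp;amp;dsid, NLOBS));  %* logical observations, deleted ones excluded;
%let ndel  = %sysfunc(attrn(&amp;amp;dsid, NDEL));   %* observations marked as deleted;
%let rc    = %sysfunc(close(&amp;amp;dsid));
%put NOTE: physical=&amp;amp;nobs logical=&amp;amp;nlobs deleted=&amp;amp;ndel;
&lt;/CODE&gt;&lt;/PRE&gt;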
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Sadly, proc sql doesn't use the index for the syntax that &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/462"&gt;@PGStats&lt;/a&gt; proposes (I can't find a better way, any other proposal?), but the modify statement does and hence is very fast.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data HUGE(index=(A=(I J))) ;
  array A [50];
  do I=1 to 50e6; 
    J=I;
    output;
  end;
run;
data SMALL;
  do I=1 to 50e6; 
    J=I;
    if ranuni(0)&amp;gt; .9999 then output;
  end;
run;

options msglevel=i;

%* MODIFY statement: 0.5 seconds;
data HUGE;
  set SMALL ;
  modify HUGE key=A;
  if _iorc_=%sysrc(_sok) then remove;
  else _error_=0;
run;
proc contents data=HUGE; 
run;

%* MERGE statement: 110 seconds;
data HUGE;
  merge HUGE SMALL(in=BAD);
  by I J;
  if not BAD;
run;
 
%* DELETE SQL statement: hours?;
proc sql;
   delete from HUGE as H
   where exists (select I, J from SMALL where I=H.I and J=H.J);
quit;

proc contents data=HUGE; 
run;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;More benchmarks and&amp;nbsp;discussions about performance in &lt;A href="https://www.amazon.com/High-Performance-SAS-Coding-Christian-Graffeuille/dp/1512397490" target="_blank"&gt;https://www.amazon.com/High-Performance-SAS-Coding-Christian-Graffeuille/dp/1512397490&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 20 Jun 2016 00:14:34 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Fastest-way-to-delete-rows-from-a-dataset-by-key/m-p/278555#M269678</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2016-06-20T00:14:34Z</dc:date>
    </item>
    <item>
      <title>Re: Fastest way to delete rows from a dataset by key</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Fastest-way-to-delete-rows-from-a-dataset-by-key/m-p/278568#M269679</link>
      <description>&lt;P&gt;If table HUGE has a single-field UNIQUE index, it seems you can get decent performance from proc SQL by translating your two-field key into the single key and using IN instead of EXISTS. The only reason I see for going to the trouble of doing this is the rollback ability of SQL, which isn't provided by data step MERGE or MODIFY operations. I modified&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/16961"&gt;@ChrisNZ&lt;/a&gt;'s&amp;nbsp;test code as follows:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;options msglevel=i;

data HUGE(index=(A=(I J) ID/unique)) ;
  array A [50];
  do I=1 to 10e6; 
    J=I;
    ID + 1;
    output;
  end;
run;

data SMALL;
  do I=1 to 10e6; 
    J=I;
    if ranuni(0)&amp;gt; .9999 then output;
  end;
run;

proc sql /* undo_policy=none */ ;
/* Optimized with index A */
create table IDS as 
select H.ID 
from HUGE as H inner join SMALL as S on H.i=S.i and H.j=S.j; 
/* No mention of index ID, but runs in a couple of minutes */
delete from HUGE as H
where ID in (select ID from IDS);
quit; &lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;In my tests, adding undo_policy=none didn't improve performance.&lt;/P&gt;</description>
      <pubDate>Mon, 20 Jun 2016 03:31:36 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Fastest-way-to-delete-rows-from-a-dataset-by-key/m-p/278568#M269679</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2016-06-20T03:31:36Z</dc:date>
    </item>
    <item>
      <title>Re: Fastest way to delete rows from a dataset by key</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Fastest-way-to-delete-rows-from-a-dataset-by-key/m-p/278857#M269680</link>
      <description>&lt;P&gt;&amp;nbsp;@PGStats SAS did&amp;nbsp;not use the ID index in your example.&lt;/P&gt;
&lt;P&gt;Even using the idxwhere= option doesn't trigger the index.&lt;/P&gt;
&lt;P&gt;A fixed list does though. I reported this issue long ago and it seems it is still here.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data HUGE1(index=(A=(I J) ID/unique)) 
     HUGE2(index=(A=(I J) ID/unique)) 
     SMALL(keep=I J);
  array A [50];
  do I=1 to 10e6; 
    J=I;
    ID + 1;
    if ranuni(0)&amp;gt; .9999 then output SMALL;
    output HUGE1 HUGE2;
  end;
run;

proc sql noprint; /* Optimized with index A */
create table IDS as 
  select H.ID 
  from HUGE1 as H inner join SMALL as S 
  on H.i=S.i and H.j=S.j; 
select ID into :ids separated by ',' from IDS;
quit;

proc sql;  /* No index used: 11 seconds */
delete from HUGE1(idxwhere=yes) where ID in (select ID from IDS);
quit;

proc sql; /* index ID used: 0.2 seconds */
delete from HUGE2 where ID in (&amp;amp;IDS);
quit; &lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The modify statement looks like the easiest way to speed up this job.&lt;/P&gt;
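&lt;P&gt;Applied to the original tables, this could look roughly like the sketch below. It assumes HUGE does not yet have a composite index on (K1 K2) and that both tables are in WORK; the index name K1K2 is made up for illustration. If the index already exists, skip the proc datasets step and use its actual name.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Assumption: no index on (K1 K2) yet; K1K2 is a hypothetical name */
proc datasets lib=WORK nolist;
  modify HUGE;
  index create K1K2=(K1 K2) / unique;
quit;

data HUGE;
  set SMALL(keep=K1 K2);               /* drive the step from the small table      */
  modify HUGE key=K1K2;                /* indexed lookup of the matching HUGE row  */
  if _iorc_=%sysrc(_sok) then remove;  /* match found: mark the row as deleted     */
  else _error_=0;                      /* no match: reset the error flag           */
run;
&lt;/CODE&gt;&lt;/PRE&gt;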
&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/5590"&gt;@Edoedoedo&lt;/a&gt; Did this work for you?&lt;/P&gt;</description>
      <pubDate>Mon, 20 Jun 2016 22:13:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Fastest-way-to-delete-rows-from-a-dataset-by-key/m-p/278857#M269680</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2016-06-20T22:13:41Z</dc:date>
    </item>
    <item>
      <title>Re: Fastest way to delete rows from a dataset by key</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Fastest-way-to-delete-rows-from-a-dataset-by-key/m-p/279234#M269681</link>
      <description>&lt;P&gt;Hi guys,&lt;/P&gt;
&lt;P&gt;I tried all your suggestions, here are the results. (HUGE is 70GB, SMALL is 10MB)&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;HASH&amp;nbsp;solution, by&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/17813"&gt;@KachiM&lt;/a&gt;: amazing, it does the whole job in 7 minutes, including&amp;nbsp;CPU time and I/O time (writing 70GB to&amp;nbsp;disk takes time; just doing "cp HUGE HUGE2" takes 3 minutes)&lt;BR /&gt;Pro: fastest solution&lt;BR /&gt;Cons:&amp;nbsp;since it creates a new table, it wastes I/O time, needs double the disk space, and requires recreating the index after the operation&lt;BR /&gt;Comments: I didn't know about hashes, so I read about them in the documentation. In your opinion, would loading both tables into memory speed things up further? Are there more sophisticated strategies to increase performance with hashes? Moreover, what happens if the system does not have enough RAM to hold the tables: does it just slow down, or does SAS crash?&lt;BR /&gt;&lt;BR /&gt;&lt;/LI&gt;
&lt;LI&gt;DELETE solution, by&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/462"&gt;@PGStats&lt;/a&gt;: thank you for the correct syntax, it will surely be useful! However, I aborted it after 1 hour.&lt;BR /&gt;Pro: rollback available, no double disk space, no need to recreate the index&lt;BR /&gt;Cons: takes too long&lt;BR /&gt;Comments:&amp;nbsp;too slow for a production environment&lt;BR /&gt;&lt;BR /&gt;&lt;/LI&gt;
&lt;LI&gt;MODIFY solution, by&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/16961"&gt;@ChrisNZ&lt;/a&gt;: quite a good solution, it&amp;nbsp;takes about 20 minutes to complete&lt;BR /&gt;Pro: no double disk space, no need to recreate the index, quite fast&lt;BR /&gt;Cons: not as fast as HASH&lt;BR /&gt;&lt;BR /&gt;&lt;/LI&gt;
&lt;LI&gt;MODIFY with HASH solution, by Me:&lt;BR /&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data HUGE;
    modify HUGE;
    if _n_ = 1 then do;
        declare hash h(dataset:'SMALL');
        h.definekey('K1','K2');
        h.definedone();
    end;
    if h.find() = 0 then remove;  /* find() returns 0 when the key exists in SMALL */
run;&lt;/CODE&gt;&lt;/PRE&gt;
I thought I had a good idea mixing your solutions; however, after 1 hour I aborted it&amp;nbsp;&lt;img id="smileysad" class="emoticon emoticon-smileysad" src="https://communities.sas.com/i/smilies/16x16_smiley-sad.png" alt="Smiley Sad" title="Smiley Sad" /&gt;&lt;/LI&gt;
&lt;/UL&gt;
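&lt;P&gt;For reference, the first option in the list above can be sketched so that the index is created in the same step as the copy, rather than rebuilt afterwards. This is only a sketch: the output name WANT and the index name K1K2 are illustrative, and whether building the index inline is quicker than a separate rebuild would need to be measured.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* WANT and K1K2 are hypothetical names used for illustration */
data WANT(index=(K1K2=(K1 K2) / unique));
  if _n_ = 1 then do;
    if 0 then set SMALL;               /* define K1 and K2 for the hash object  */
    declare hash h(dataset:'SMALL');   /* load the keys to delete into memory   */
    h.definekey('K1','K2');
    h.definedone();
  end;
  set HUGE;                            /* sequential read of the big table      */
  if h.check() ^= 0;                   /* keep only rows whose key is not found */
run;
&lt;/CODE&gt;&lt;/PRE&gt;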
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;So I think I'll use the first solution. Any other comments and suggestions are very welcome!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks a lot&lt;/P&gt;
&lt;P&gt;Regards&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 22 Jun 2016 08:32:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Fastest-way-to-delete-rows-from-a-dataset-by-key/m-p/279234#M269681</guid>
      <dc:creator>Edoedoedo</dc:creator>
      <dc:date>2016-06-22T08:32:41Z</dc:date>
    </item>
    <item>
      <title>Re: Fastest way to delete rows from a dataset by key</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Fastest-way-to-delete-rows-from-a-dataset-by-key/m-p/279238#M269682</link>
      <description>Use check() instead of find(), though I doubt it will make a huge difference. &lt;BR /&gt;Sadly, it's not rare for SAS to be much faster copying a whole table than updating it in place. I guess it depends on the proportion of records to update. &lt;BR /&gt;Well done!</description>
      <pubDate>Wed, 22 Jun 2016 08:49:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Fastest-way-to-delete-rows-from-a-dataset-by-key/m-p/279238#M269682</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2016-06-22T08:49:35Z</dc:date>
    </item>
    <item>
      <title>Re: Fastest way to delete rows from a dataset by key</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Fastest-way-to-delete-rows-from-a-dataset-by-key/m-p/279241#M269683</link>
      <description>Also did you count the time recreating the index?</description>
      <pubDate>Wed, 22 Jun 2016 08:52:40 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Fastest-way-to-delete-rows-from-a-dataset-by-key/m-p/279241#M269683</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2016-06-22T08:52:40Z</dc:date>
    </item>
    <item>
      <title>Re: Fastest way to delete rows from a dataset by key</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Fastest-way-to-delete-rows-from-a-dataset-by-key/m-p/279242#M269684</link>
      <description>&lt;P&gt;This is a good place for PROC DS2.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
 set sashelp.class;
run;


proc ds2;
data Male(overwrite=yes);
 method run();
  set have;
  if sex eq 'M' then output;
 end;
enddata;
run;
quit;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 22 Jun 2016 08:55:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Fastest-way-to-delete-rows-from-a-dataset-by-key/m-p/279242#M269684</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2016-06-22T08:55:46Z</dc:date>
    </item>
    <item>
      <title>Re: Fastest way to delete rows from a dataset by key</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Fastest-way-to-delete-rows-from-a-dataset-by-key/m-p/279243#M269685</link>
      <description>Yes, approximately 5 minutes</description>
      <pubDate>Wed, 22 Jun 2016 08:56:52 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Fastest-way-to-delete-rows-from-a-dataset-by-key/m-p/279243#M269685</guid>
      <dc:creator>Edoedoedo</dc:creator>
      <dc:date>2016-06-22T08:56:52Z</dc:date>
    </item>
    <item>
      <title>Re: Fastest way to delete rows from a dataset by key</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Fastest-way-to-delete-rows-from-a-dataset-by-key/m-p/279244#M269686</link>
      <description>&lt;P&gt;Use RETURN instead of the data step's DELETE.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
 set sashelp.class;
run;


proc ds2;
data Non_male(overwrite=yes);
 method run();
  set have;
  if sex eq 'M' then return;
 end;
enddata;
run;
quit;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 22 Jun 2016 08:58:40 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Fastest-way-to-delete-rows-from-a-dataset-by-key/m-p/279244#M269686</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2016-06-22T08:58:40Z</dc:date>
    </item>
    <item>
      <title>Re: Fastest way to delete rows from a dataset by key</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Fastest-way-to-delete-rows-from-a-dataset-by-key/m-p/279246#M269687</link>
      <description>&lt;P&gt;Since you are now doing a sequential copy (at which sas excels) read my comments in &lt;A href="https://communities.sas.com/t5/Base-SAS-Programming/How-to-speed-up-performance-of-Sas-9-4-TS1M3-through-enterprise/m-p/276653/" target="_blank"&gt;https://communities.sas.com/t5/Base-SAS-Programming/How-to-speed-up-performance-of-Sas-9-4-TS1M3-through-enterprise/m-p/276653/&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 22 Jun 2016 09:10:25 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Fastest-way-to-delete-rows-from-a-dataset-by-key/m-p/279246#M269687</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2016-06-22T09:10:25Z</dc:date>
    </item>
    <item>
      <title>Re: Fastest way to delete rows from a dataset by key</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Fastest-way-to-delete-rows-from-a-dataset-by-key/m-p/279248#M269688</link>
      <description>So 5 minutes out of 7 are spent recreating the index? really?</description>
      <pubDate>Wed, 22 Jun 2016 09:01:56 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Fastest-way-to-delete-rows-from-a-dataset-by-key/m-p/279248#M269688</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2016-06-22T09:01:56Z</dc:date>
    </item>
  </channel>
</rss>

