<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Optimization of processing huge data sets in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Optimization-of-processing-huge-data-sets/m-p/7671#M114</link>
    <description>Do you get any feedback in the log with options msglevel=i; set?&lt;BR /&gt;
It would clarify whether the WHERE clause is being resolved through the index.&lt;BR /&gt;
&lt;BR /&gt;
Have you tried the PROC SQL DELETE statement?</description>
    <pubDate>Tue, 25 Mar 2008 17:04:31 GMT</pubDate>
    <dc:creator>deleted_user</dc:creator>
    <dc:date>2008-03-25T17:04:31Z</dc:date>
    <item>
      <title>Optimization of processing huge data sets</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Optimization-of-processing-huge-data-sets/m-p/7670#M113</link>
      <description>Dear SAS experts,&lt;BR /&gt;
&lt;BR /&gt;
I have run into a processing-time problem when removing a small subset of rows from a data set that contains ~200 billion rows.&lt;BR /&gt;
&lt;BR /&gt;
The data set has indexes on the key variables, and those variables are used in the WHERE expression that selects the rows to delete.&lt;BR /&gt;
&lt;BR /&gt;
To rapidly delete the subset, I use a data step:&lt;BR /&gt;
&lt;BR /&gt;
data DSName;&lt;BR /&gt;
modify DSName(where=(expression...));&lt;BR /&gt;
remove;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
I think it still takes too much time. Could you please suggest a faster way to remove a small subset of rows from the data set?&lt;BR /&gt;
&lt;BR /&gt;
Thanks a lot.&lt;BR /&gt;
Sarunas.</description>
      <pubDate>Tue, 25 Mar 2008 16:05:17 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Optimization-of-processing-huge-data-sets/m-p/7670#M113</guid>
      <dc:creator>deleted_user</dc:creator>
      <dc:date>2008-03-25T16:05:17Z</dc:date>
    </item>
    <item>
      <title>Re: Optimization of processing huge data sets</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Optimization-of-processing-huge-data-sets/m-p/7671#M114</link>
      <description>Do you get any feedback in the log with options msglevel=i; set?&lt;BR /&gt;
It would clarify whether the WHERE clause is being resolved through the index.&lt;BR /&gt;
&lt;BR /&gt;
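A minimal sketch of that check, for illustration only. It is a read-only step, so nothing is deleted; KeyVar and the value 12345 are hypothetical placeholders for the real WHERE expression:&lt;BR /&gt;
&lt;BR /&gt;
options msglevel=i; /* write INFO notes about index usage to the log */&lt;BR /&gt;
data _null_;&lt;BR /&gt;
set DSName(where=(KeyVar = 12345)); /* hypothetical key variable and value */&lt;BR /&gt;
stop; /* only the log note is needed, so stop after the first row */&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
If the index is used, the log should contain an INFO note saying that an index was selected for WHERE clause optimization.&lt;BR /&gt;
&lt;BR /&gt;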
Have you tried the PROC SQL DELETE statement?</description>
      <pubDate>Tue, 25 Mar 2008 17:04:31 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Optimization-of-processing-huge-data-sets/m-p/7671#M114</guid>
      <dc:creator>deleted_user</dc:creator>
      <dc:date>2008-03-25T17:04:31Z</dc:date>
    </item>
    <item>
      <title>Re: Optimization of processing huge data sets</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Optimization-of-processing-huge-data-sets/m-p/7672#M115</link>
      <description>For a table with that many rows, have you considered using the SPD Engine?&lt;BR /&gt;
I would have put the data in a partitioned Oracle database.&lt;BR /&gt;
&lt;BR /&gt;
Would using "Key=.." possibly be faster than "(where=...)"?&lt;BR /&gt;
&lt;BR /&gt;
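For illustration, a rough sketch of the KEY= approach. It assumes a simple index named KeyVar on DSName, a small data set Keys that holds the key values to delete in a variable KeyVar, and at most one matching row per key value (all of these names are hypothetical):&lt;BR /&gt;
&lt;BR /&gt;
data DSName;&lt;BR /&gt;
set Keys; /* hypothetical list of key values to delete */&lt;BR /&gt;
modify DSName key=KeyVar; /* indexed lookup of the current key value */&lt;BR /&gt;
if _iorc_ = 0 then remove; /* match found: delete that row */&lt;BR /&gt;
else _error_ = 0; /* no match: clear the error flag and carry on */&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
If a key value can match several rows, the lookup would have to be repeated in a loop until _iorc_ is no longer 0.&lt;BR /&gt;
&lt;BR /&gt;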
What is the box?  What are you using for storage?  How are you connected to the storage?</description>
      <pubDate>Tue, 25 Mar 2008 19:37:02 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Optimization-of-processing-huge-data-sets/m-p/7672#M115</guid>
      <dc:creator>deleted_user</dc:creator>
      <dc:date>2008-03-25T19:37:02Z</dc:date>
    </item>
    <item>
      <title>Re: Optimization of processing huge data sets</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Optimization-of-processing-huge-data-sets/m-p/7673#M116</link>
      <description>Thank you, Peter and Chuck, for the hints &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt; &lt;BR /&gt;
&lt;BR /&gt;
Well, it is strange, but proc sql with a delete statement, using the indexes, runs twice as fast as the data step... &lt;span class="lia-unicode-emoji" title=":confused_face:"&gt;😕&lt;/span&gt; &lt;BR /&gt;
&lt;BR /&gt;
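For reference, a sketch of the kind of step meant here (KeyVar and the value 12345 are hypothetical placeholders for the real WHERE expression):&lt;BR /&gt;
&lt;BR /&gt;
proc sql;&lt;BR /&gt;
delete from DSName&lt;BR /&gt;
where KeyVar = 12345; /* hypothetical indexed key variable and value */&lt;BR /&gt;
quit;&lt;BR /&gt;
&lt;BR /&gt;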
I thought the data step would be much faster than procedures.&lt;BR /&gt;
&lt;BR /&gt;
Well, anyway, proc sql satisfies my deletion time expectations.&lt;BR /&gt;
&lt;BR /&gt;
Thanks and good luck.&lt;BR /&gt;
Sarunas</description>
      <pubDate>Wed, 26 Mar 2008 09:12:02 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Optimization-of-processing-huge-data-sets/m-p/7673#M116</guid>
      <dc:creator>deleted_user</dc:creator>
      <dc:date>2008-03-26T09:12:02Z</dc:date>
    </item>
    <item>
      <title>Re: Optimization of processing huge data sets</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Optimization-of-processing-huge-data-sets/m-p/7674#M117</link>
      <description>&amp;gt; Well, it strange, but proc sql with delete statement,&lt;BR /&gt;
&amp;gt; using indexes works twice as fast as data step... &lt;span class="lia-unicode-emoji" title=":confused_face:"&gt;😕&lt;/span&gt;&lt;BR /&gt;
&amp;gt; &lt;BR /&gt;
&amp;gt; &lt;BR /&gt;
&amp;gt; I thought that data step would have to work much&lt;BR /&gt;
&amp;gt; faster than procedures.&lt;BR /&gt;
&lt;BR /&gt;
Proc SQL uses a completely different code path than the data step.  The data step is much older and tends to be more serialized than SQL, so it tends not to take advantage of multiple threads/processors as well as SQL can.</description>
      <pubDate>Wed, 26 Mar 2008 12:47:36 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Optimization-of-processing-huge-data-sets/m-p/7674#M117</guid>
      <dc:creator>deleted_user</dc:creator>
      <dc:date>2008-03-26T12:47:36Z</dc:date>
    </item>
  </channel>
</rss>

