<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: data step in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/data-step/m-p/168997#M264083</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;It often comes down to how you are setting up your data model. No clues are given here about user requirements, search patterns, update frequency, etc.&lt;/P&gt;&lt;P&gt;Physically, do you really need so many indexes? Do they all have more than 10 distinct values with a normal distribution?&lt;/P&gt;&lt;P&gt;Second, if x_final (specified as a work table here, but it can't be, right?) is stored with the SAS Base engine, it will perform much better if you apply the indexes after the table is created.&lt;/P&gt;&lt;P&gt;Consider using an SPDE libname instead; it will update the indexes in parallel, using multi-threading.&lt;/P&gt;&lt;P&gt;Do you need to recreate the whole table each time? Consider an append/insert approach.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Mon, 10 Feb 2014 13:40:48 GMT</pubDate>
    <dc:creator>LinusH</dc:creator>
    <dc:date>2014-02-10T13:40:48Z</dc:date>
    <item>
      <title>data step</title>
      <link>https://communities.sas.com/t5/SAS-Programming/data-step/m-p/168996#M264082</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi SAS Experts,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I have a data step:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;data x_final(sortedby=c1 c2 c3 c4 index=(d1 c1 c2 c3 c4 c5 c6 c7 c8 c9 c10 c11 c12 c13 c14 c15 c16 c17));&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; set x_updated;&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; by c1 c2 c3 c4;&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;This data step takes around 3 hours when x_updated has around 20 million records. When x_updated has around 0.25 million records, it takes only 5 minutes to complete.&lt;BR /&gt;My question to the experts:&lt;BR /&gt;Is there any way to reduce the execution time when the source dataset has 20 million or more records?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 10 Feb 2014 13:26:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/data-step/m-p/168996#M264082</guid>
      <dc:creator>kds</dc:creator>
      <dc:date>2014-02-10T13:26:59Z</dc:date>
    </item>
    <item>
      <title>Re: data step</title>
      <link>https://communities.sas.com/t5/SAS-Programming/data-step/m-p/168997#M264083</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;It often comes down to how you are setting up your data model. No clues are given here about user requirements, search patterns, update frequency, etc.&lt;/P&gt;&lt;P&gt;Physically, do you really need so many indexes? Do they all have more than 10 distinct values with a normal distribution?&lt;/P&gt;&lt;P&gt;Second, if x_final (specified as a work table here, but it can't be, right?) is stored with the SAS Base engine, it will perform much better if you apply the indexes after the table is created.&lt;/P&gt;&lt;P&gt;Consider using an SPDE libname instead; it will update the indexes in parallel, using multi-threading.&lt;/P&gt;&lt;P&gt;Do you need to recreate the whole table each time? Consider an append/insert approach.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 10 Feb 2014 13:40:48 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/data-step/m-p/168997#M264083</guid>
      <dc:creator>LinusH</dc:creator>
      <dc:date>2014-02-10T13:40:48Z</dc:date>
    </item>
    <item>
      <title>Re: data step</title>
      <link>https://communities.sas.com/t5/SAS-Programming/data-step/m-p/168998#M264084</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Well, it's always possible that your actual DATA step is more complex than the one you have shown here.&amp;nbsp; But looking at just this code, you don't really need a DATA step at all.&amp;nbsp; The observations and data values in X_FINAL are identical to X_UPDATED.&amp;nbsp; You could just use X_UPDATED and build your indexes (hate that word!) using a different tool that doesn't need to read in each observation individually the way a DATA step does.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Looking at the number of variables for which you are creating an index, it concerns me that the programs that use the indexes might be less efficient as well.&amp;nbsp; But you haven't posted any of that code so it's difficult to judge.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 10 Feb 2014 14:24:43 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/data-step/m-p/168998#M264084</guid>
      <dc:creator>Astounding</dc:creator>
      <dc:date>2014-02-10T14:24:43Z</dc:date>
    </item>
  </channel>
</rss>

