<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: ds2 vs proc sql in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/ds2-vs-proc-sql/m-p/464595#M284940</link>
    <description>&lt;P&gt;Oh wow! Good find!&lt;/P&gt;
&lt;P&gt;You should definitely check with tech support whether this is expected behaviour.&lt;/P&gt;
&lt;P&gt;There is no reason compressed data sets should kill speed. That's crazy.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;About the grid: can you see the threads running in the different nodes?&lt;/P&gt;</description>
    <pubDate>Wed, 23 May 2018 22:44:34 GMT</pubDate>
    <dc:creator>ChrisNZ</dc:creator>
    <dc:date>2018-05-23T22:44:34Z</dc:date>
    <item>
      <title>ds2 vs proc sql</title>
      <link>https://communities.sas.com/t5/SAS-Programming/ds2-vs-proc-sql/m-p/463057#M284935</link>
      <description>&lt;P&gt;I'm new to DS2, but learning.&amp;nbsp; We would like to use DS2 to cut down on processing time for certain processes we run.&amp;nbsp; From some simple testing, I'm finding that a threaded DS2 process takes almost twice as long as PROC SQL, even though I can see that DS2 uses several threads.&amp;nbsp; I find this quite puzzling since I had hoped DS2 would run faster.&amp;nbsp; Any input or comments on why DS2 is running slower?&amp;nbsp; Queries and logs below.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;/* proc sql for comparing to ds2 thread step below -- basically copying a compressed dataset */&lt;BR /&gt;proc sql;&lt;BR /&gt;create table lib1.ds2_test_procsql as&lt;BR /&gt;select segment, region, state, MBR_ENRLMNT_CNT from lib1.final;&lt;BR /&gt;quit;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;NOTE: Table LIB1.DS2_TEST_PROCSQL created, with 35811639 rows and 4 columns.&lt;/P&gt;
&lt;P&gt;NOTE: PROCEDURE SQL used (Total process time):&lt;BR /&gt;&lt;FONT color="#FF0000"&gt; real time 2:04.90&lt;/FONT&gt;&lt;BR /&gt; user cpu time 54.25 seconds&lt;BR /&gt; system cpu time 7.44 seconds&lt;BR /&gt; memory 6290.06k&lt;BR /&gt; OS Memory 28840.00k&lt;BR /&gt; Timestamp 05/17/2018 08:43:26 AM&lt;BR /&gt; Step Count 39 Switch Count 284&lt;BR /&gt; Page Faults 0&lt;BR /&gt; Page Reclaims 163&lt;BR /&gt; Page Swaps 0&lt;BR /&gt; Voluntary Context Switches 3487&lt;BR /&gt; Involuntary Context Switches 33598&lt;BR /&gt; Block Input Operations 0&lt;BR /&gt; Block Output Operations 0&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;/* ds2 thread steps */&lt;BR /&gt;proc ds2; &lt;BR /&gt; thread newton/overwrite=yes; &lt;BR /&gt; dcl double y count; &lt;BR /&gt; dcl bigint thisThread;&lt;BR /&gt; drop count y thisThread;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;method run(); &lt;BR /&gt; set {select segment, region, state, MBR_ENRLMNT_CNT from lib1.final};&lt;BR /&gt; thisThread= _threadid_; &lt;BR /&gt; count+1;&lt;BR /&gt; end;&lt;BR /&gt; &lt;BR /&gt; method term();&lt;BR /&gt; put '**Thread' _threadid_ 'processed' count 'rows:';&lt;BR /&gt; end;&lt;BR /&gt; endthread; &lt;BR /&gt; run; &lt;BR /&gt;quit;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;proc ds2; &lt;BR /&gt; data lib1.ds2_test_thread / overwrite=yes;&lt;BR /&gt; dcl thread newton frac;&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt; method run(); &lt;BR /&gt; set from frac threads=4; &lt;BR /&gt; end; &lt;BR /&gt; enddata;&lt;BR /&gt; run; &lt;BR /&gt;quit;&lt;/P&gt;
&lt;P&gt;**Thread 1 processed 4086810 rows:&lt;BR /&gt;**Thread 0 processed 4037670 rows:&lt;BR /&gt;**Thread 3 processed 13699140 rows:&lt;BR /&gt;**Thread 2 processed 13988019 rows:&lt;BR /&gt;NOTE: Execution succeeded. 35811639 rows affected.&lt;/P&gt;
&lt;P&gt;NOTE: PROCEDURE DS2 used (Total process time):&lt;BR /&gt;&lt;FONT color="#FF0000"&gt; real time 3:47.06&lt;/FONT&gt;&lt;BR /&gt; user cpu time 1:49.49&lt;BR /&gt; system cpu time 15.59 seconds&lt;BR /&gt; memory 5952.46k&lt;BR /&gt; OS Memory 28336.00k&lt;BR /&gt; Timestamp 05/17/2018 08:47:14 AM&lt;BR /&gt; Step Count 42 Switch Count 66&lt;BR /&gt; Page Faults 0&lt;BR /&gt; Page Reclaims 1620&lt;BR /&gt; Page Swaps 0&lt;BR /&gt; Voluntary Context Switches 56263&lt;BR /&gt; Involuntary Context Switches 64441&lt;BR /&gt; Block Input Operations 0&lt;BR /&gt; Block Output Operations 0&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 17 May 2018 16:51:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/ds2-vs-proc-sql/m-p/463057#M284935</guid>
      <dc:creator>alandool</dc:creator>
      <dc:date>2018-05-17T16:51:59Z</dc:date>
    </item>
    <item>
      <title>Re: ds2 vs proc sql</title>
      <link>https://communities.sas.com/t5/SAS-Programming/ds2-vs-proc-sql/m-p/463209#M284936</link>
      <description>&lt;P&gt;Several elements to answer your questions:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;1- Is your process CPU-bound? Multi-threading only makes sense if the CPU is the bottleneck.&amp;nbsp;If not, the overhead of managing and synchronising threads makes things worse.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;2- If the disk is the bottleneck (as it usually is with SAS jobs), multi-threading makes things worse: each thread accesses the disk independently (rather than one thread accessing the disk sequentially), which makes the disk I/O operations more random.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;3- For the reasons given above, proc ds2 is mostly aimed at distributed/grid (or at least multi-path) environments.&lt;BR /&gt;&amp;nbsp; &amp;nbsp;Using multi-path SPDE partitioned storage does not qualify, unless you can limit each thread to&amp;nbsp;its own location.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp;Otherwise, we are back to threads fighting for the data and killing I/O throughput.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;4- proc ds2 is more complex and this means it incurs an overhead; so unless the conditions above are met, performance takes a hit when compared to traditional SAS programming methods.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;At least that's my (limited) experience of DS2.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 18 May 2018 03:59:28 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/ds2-vs-proc-sql/m-p/463209#M284936</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2018-05-18T03:59:28Z</dc:date>
    </item>
    <item>
      <title>Re: ds2 vs proc sql</title>
      <link>https://communities.sas.com/t5/SAS-Programming/ds2-vs-proc-sql/m-p/463755#M284937</link>
      <description>&lt;P&gt;Thank you for your reply and information,&amp;nbsp;&lt;SPAN class="login-bold"&gt;&lt;A id="link_13" class="lia-link-navigation lia-page-link lia-user-name-link" href="https://communities.sas.com/t5/user/viewprofilepage/user-id/16961" target="_self"&gt;ChrisNZ&lt;/A&gt;,&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="login-bold"&gt;&lt;SPAN&gt;What you said about proc ds2 complexity and overhead makes sense.&amp;nbsp;&amp;nbsp;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="login-bold"&gt;We do have a grid environment.&amp;nbsp; I'm not sure how I would go about determining items #1 and #2 in your list.&amp;nbsp; (#1 I've read about in the past, but don't remember how to determine)&amp;nbsp;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="login-bold"&gt;For our team, there will be no incentive to convert jobs to DS2 if threads bring no improvement in processing speed.&amp;nbsp; That was the main focus of this little experiment.&amp;nbsp; At this point I am rather disappointed.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="login-bold"&gt;Alan&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 21 May 2018 11:37:51 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/ds2-vs-proc-sql/m-p/463755#M284937</guid>
      <dc:creator>alandool</dc:creator>
      <dc:date>2018-05-21T11:37:51Z</dc:date>
    </item>
    <item>
      <title>Re: ds2 vs proc sql</title>
      <link>https://communities.sas.com/t5/SAS-Programming/ds2-vs-proc-sql/m-p/463927#M284938</link>
      <description>&lt;P&gt;The log is usually the first place to look to see whether the processor is the bottleneck (compare the CPU times with the real time), though when the process is multi-threaded, interpretation becomes more difficult.&amp;nbsp;&lt;/P&gt;
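&lt;P&gt;As a rough illustration of reading the log this way, the sketch below (Python, run outside SAS; the function names to_seconds and cpu_ratio are made up for this example) divides total CPU time by real time for the two steps posted at the top of this thread. The interpretation is an assumption, not a SAS rule: a ratio well below 1 suggests the step spent much of its elapsed time waiting on something other than the CPU, while a CPU-bound step running 4 threads could approach 4.&lt;/P&gt;

```python
def to_seconds(stamp):
    """Convert a SAS log time such as '2:04.90' or '54.25' to seconds."""
    total = 0.0
    for part in stamp.split(":"):
        # Each colon-separated field is worth 60 times the next one.
        total = total * 60 + float(part)
    return total


def cpu_ratio(real, user_cpu, system_cpu):
    """Return (user CPU + system CPU) / real time for one step."""
    cpu = to_seconds(user_cpu) + to_seconds(system_cpu)
    return cpu / to_seconds(real)


# Figures copied from the two log excerpts in the original post.
sql_ratio = cpu_ratio("2:04.90", "54.25", "7.44")
ds2_ratio = cpu_ratio("3:47.06", "1:49.49", "15.59")

print(f"PROC SQL cpu/real: {sql_ratio:.2f}")
print(f"PROC DS2 cpu/real: {ds2_ratio:.2f}")
```

&lt;P&gt;With the posted figures both ratios come out at roughly one half (about 0.49 and 0.55), which hints that neither step was limited by the processor, though that reading is only a heuristic.&lt;/P&gt;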
&lt;P&gt;I have no experience with SAS Grid (that's high on my to-do list), but I suppose it makes the log even more ... interesting ...&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Most references to DS2 only mention in-database accelerators and CAS, so it may be that these are the main targets for DS2 rather than SAS Grid. That would be sad.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Here is a paper showing how DS2 can still be beneficial in a traditional environment.&lt;/P&gt;
&lt;P&gt;&lt;A href="http://support.sas.com/resources/papers/proceedings16/3780-2016.pdf" target="_blank"&gt;http://support.sas.com/resources/papers/proceedings16/3780-2016.pdf&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 21 May 2018 22:44:18 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/ds2-vs-proc-sql/m-p/463927#M284938</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2018-05-21T22:44:18Z</dc:date>
    </item>
    <item>
      <title>Re: ds2 vs proc sql</title>
      <link>https://communities.sas.com/t5/SAS-Programming/ds2-vs-proc-sql/m-p/464539#M284939</link>
      <description>&lt;P&gt;Hi &lt;A class="lia-link-navigation lia-page-link lia-user-name-link" id="link_8" style="color: rgb(51, 153, 102);" href="https://communities.sas.com/t5/user/viewprofilepage/user-id/16961" target="_self"&gt;&lt;SPAN class="login-bold"&gt;ChrisNZ&lt;/SPAN&gt;&lt;/A&gt;,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thank you for your input again!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I think I found the issue.&amp;nbsp; The DS2 threaded program seems to run &lt;STRONG&gt;slower&lt;/STRONG&gt; because the SAS dataset it is reading is &lt;STRONG&gt;compressed&lt;/STRONG&gt;.&amp;nbsp; It runs faster than the PROC SQL program&amp;nbsp;when the dataset is not compressed.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks for pointing me to that particular PDF.&amp;nbsp; I used a sample DS2 program from it, and that sample program led me to the discovery mentioned above.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;We often use compressed datasets, so if that is the case then DS2 will not&amp;nbsp;speed up many of our processes.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Alan&lt;/P&gt;</description>
      <pubDate>Wed, 23 May 2018 19:26:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/ds2-vs-proc-sql/m-p/464539#M284939</guid>
      <dc:creator>alandool</dc:creator>
      <dc:date>2018-05-23T19:26:22Z</dc:date>
    </item>
    <item>
      <title>Re: ds2 vs proc sql</title>
      <link>https://communities.sas.com/t5/SAS-Programming/ds2-vs-proc-sql/m-p/464595#M284940</link>
      <description>&lt;P&gt;Oh wow! Good find!&lt;/P&gt;
&lt;P&gt;You should definitely check with tech support whether this is expected behaviour.&lt;/P&gt;
&lt;P&gt;There is no reason compressed data sets should kill speed. That's crazy.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;About the grid: can you see the threads running in the different nodes?&lt;/P&gt;</description>
      <pubDate>Wed, 23 May 2018 22:44:34 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/ds2-vs-proc-sql/m-p/464595#M284940</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2018-05-23T22:44:34Z</dc:date>
    </item>
  </channel>
</rss>

