<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: views are faster for parallel processing. why? in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/views-are-faster-for-parallel-processing-why/m-p/67494#M14611</link>
    <description>Hello Chris.&lt;BR /&gt;
&lt;BR /&gt;
Somehow I missed this very interesting post.&lt;BR /&gt;
&lt;BR /&gt;
Will submit your code on my AIX box...&lt;BR /&gt;
&lt;BR /&gt;
Cheers from Portugal.&lt;BR /&gt;
&lt;BR /&gt;
Daniel Santos @ &lt;A href="http://www.cgd.pt" target="_blank"&gt;www.cgd.pt&lt;/A&gt;</description>
    <pubDate>Tue, 22 Sep 2009 15:09:27 GMT</pubDate>
    <dc:creator>DanielSantos</dc:creator>
    <dc:date>2009-09-22T15:09:27Z</dc:date>
    <item>
      <title>views are faster for parallel processing. why?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/views-are-faster-for-parallel-processing-why/m-p/67491#M14608</link>
      <description>We know that views can be faster for sequential processing as they potentially avoid having to write and read intermediate tables on disk.&lt;BR /&gt;
&lt;BR /&gt;
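A minimal sketch of that sequential case, using hypothetical names (BIG, SUBSET) that are not part of the test below: the view is resolved only when PROC MEANS reads it, so the subset is never written to disk as an intermediate table.&lt;BR /&gt;
[pre]&lt;BR /&gt;
data BIG;                      /* hypothetical input table */&lt;BR /&gt;
  do I=1 to 100000;&lt;BR /&gt;
    X=mod(I,10);&lt;BR /&gt;
    output;&lt;BR /&gt;
  end;&lt;BR /&gt;
run;&lt;BR /&gt;
data SUBSET/view=SUBSET;       /* stores only the view definition, no data */&lt;BR /&gt;
  set BIG(where=(X=1));&lt;BR /&gt;
run;&lt;BR /&gt;
proc means data=SUBSET;        /* the view executes here and feeds the procedure directly */&lt;BR /&gt;
  var I;&lt;BR /&gt;
run;&lt;BR /&gt;
[/pre]&lt;BR /&gt;
&lt;BR /&gt;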
In this case, where I merge many tables (so reads are concurrent), views are also faster. Does anyone know why?&lt;BR /&gt;
[pre]&lt;BR /&gt;
%macro t;&lt;BR /&gt;
&lt;BR /&gt;
  %do i=1 %to 30;     * create input tables and views;&lt;BR /&gt;
    data T&amp;amp;i;&lt;BR /&gt;
      retain X1-X150 1 ;&lt;BR /&gt;
      do I=1 to 100000;&lt;BR /&gt;
        output;&lt;BR /&gt;
      end;&lt;BR /&gt;
    run;&lt;BR /&gt;
    data _T&amp;amp;i/view=_T&amp;amp;i;&lt;BR /&gt;
      set T&amp;amp;i(keep=X1-X9 I);&lt;BR /&gt;
    run;&lt;BR /&gt;
  %end;&lt;BR /&gt;
&lt;BR /&gt;
&lt;BR /&gt;
  data TESTA;   * merge views: 15s/30s/7mn;&lt;BR /&gt;
    merge %do i=1 %to 30;&lt;BR /&gt;
      _T&amp;amp;i(rename=(X1=X1_&amp;amp;i X2=X2_&amp;amp;i X3=X3_&amp;amp;i X4=X4_&amp;amp;i X5=X5_&amp;amp;i X6=X6_&amp;amp;i X7=X7_&amp;amp;i X8=X8_&amp;amp;i X9=X9_&amp;amp;i))&lt;BR /&gt;
       %end;        &lt;BR /&gt;
    ;by I;&lt;BR /&gt;
  run;&lt;BR /&gt;
&lt;BR /&gt;
&lt;BR /&gt;
  data TESTB;   * merge tables: 25s/45s/12mn;&lt;BR /&gt;
    merge  %do i=1 %to 30;&lt;BR /&gt;
      T&amp;amp;i(keep=X1-X9 I&lt;BR /&gt;
          rename=(X1=X1_&amp;amp;i X2=X2_&amp;amp;i X3=X3_&amp;amp;i X4=X4_&amp;amp;i X5=X5_&amp;amp;i X6=X6_&amp;amp;i X7=X7_&amp;amp;i X8=X8_&amp;amp;i X9=X9_&amp;amp;i)) &lt;BR /&gt;
      %end;        &lt;BR /&gt;
    ;by I;&lt;BR /&gt;
  run; &lt;BR /&gt;
&lt;BR /&gt;
%mend; %t&lt;BR /&gt;
&lt;BR /&gt;
&lt;BR /&gt;
[/pre]&lt;BR /&gt;
&lt;BR /&gt;
The 3 times are from 3 different PCs with varying disk setups (note: the fastest times are on a RAID0 4x15k-disk array, the slowest on a desktop PC).&lt;BR /&gt;
PS: Don't even think of trying this with a SQL full (outer) join, or do it over your lunch break!</description>
      <pubDate>Thu, 27 Aug 2009 04:20:18 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/views-are-faster-for-parallel-processing-why/m-p/67491#M14608</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2009-08-27T04:20:18Z</dc:date>
    </item>
    <item>
      <title>Re: views are faster for parallel processing. why?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/views-are-faster-for-parallel-processing-why/m-p/67492#M14609</link>
      <description>Just bringing this to the fore once more, as a last attempt to get a comment.&lt;BR /&gt;
Has anyone replicated this behaviour?&lt;BR /&gt;
Does anyone have an idea why views would be faster?&lt;BR /&gt;
&lt;BR /&gt;
2 explanations I can think of:&lt;BR /&gt;
&lt;BR /&gt;
- When using views, input tables are read sequentially.&lt;BR /&gt;
When using tables, all tables are read concurrently one record at a time (which would thrash the disks). &lt;BR /&gt;
But then where is the data stored? Memory usage is much higher with views, but nowhere near enough to hold all the data.&lt;BR /&gt;
Maybe tables are just read in much larger blocks when using views?&lt;BR /&gt;
&lt;BR /&gt;
- The keep= option is not applied as efficiently when using tables, and full observations are read even though only the kept variables end up in the PDV.&lt;BR /&gt;
&lt;BR /&gt;
I added the data set option bufno=1k when reading tables T&amp;i, and times decreased dramatically (yet memory usage somehow stayed the same).&lt;BR /&gt;
So it looks like the 1st idea might be right: views use larger blocks. &lt;BR /&gt;
&lt;BR /&gt;
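For reference, a sketch of the bufno= test described above, written as a variant of the TESTB step. The macro name t_bufno and the output name TESTC are made up for illustration, and tables T1-T30 are assumed to exist already.&lt;BR /&gt;
[pre]&lt;BR /&gt;
%macro t_bufno;&lt;BR /&gt;
  data TESTC;   /* merge tables, each one read through 1024 buffers */&lt;BR /&gt;
    merge %do i=1 %to 30;&lt;BR /&gt;
      T&amp;i(bufno=1k keep=X1-X9 I&lt;BR /&gt;
          rename=(X1=X1_&amp;i X2=X2_&amp;i X3=X3_&amp;i X4=X4_&amp;i X5=X5_&amp;i X6=X6_&amp;i X7=X7_&amp;i X8=X8_&amp;i X9=X9_&amp;i))&lt;BR /&gt;
    %end;&lt;BR /&gt;
    ;by I;&lt;BR /&gt;
  run;&lt;BR /&gt;
%mend; %t_bufno&lt;BR /&gt;
[/pre]&lt;BR /&gt;
&lt;BR /&gt;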
Any comment?</description>
      <pubDate>Tue, 22 Sep 2009 00:51:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/views-are-faster-for-parallel-processing-why/m-p/67492#M14609</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2009-09-22T00:51:35Z</dc:date>
    </item>
    <item>
      <title>Re: views are faster for parallel processing. why?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/views-are-faster-for-parallel-processing-why/m-p/67493#M14610</link>
      <description>Chris,&lt;BR /&gt;
if we move the keep= option from the data steps that create the views to the step that merges the views, then TESTA takes much longer on my PC:&lt;BR /&gt;
&lt;BR /&gt;
NOTE: The data set WORK.TESTA has 100000 observations and 271 variables.&lt;BR /&gt;
NOTE: Compressing data set WORK.TESTA decreased size by 49.95 percent.&lt;BR /&gt;
      Compressed is 7151 pages; un-compressed would require 14288 pages.&lt;BR /&gt;
NOTE: DATA statement used (Total process time):&lt;BR /&gt;
      real time           13:19.97&lt;BR /&gt;
      cpu time            29.10 seconds&lt;BR /&gt;
&lt;BR /&gt;
NOTE: The data set WORK.TESTB has 100000 observations and 271 variables.&lt;BR /&gt;
NOTE: Compressing data set WORK.TESTB decreased size by 49.95 percent.&lt;BR /&gt;
      Compressed is 7151 pages; un-compressed would require 14288 pages.&lt;BR /&gt;
NOTE: DATA statement used (Total process time):&lt;BR /&gt;
      real time           6:12.93&lt;BR /&gt;
      cpu time            19.17 seconds&lt;BR /&gt;
&lt;BR /&gt;
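For reference, a sketch of the variant described above, assuming it amounts to dropping keep= from the view definitions and putting it on the merge instead. The macro name t_keep is made up for illustration, and tables T1-T30 are assumed to exist already.&lt;BR /&gt;
[pre]&lt;BR /&gt;
%macro t_keep;&lt;BR /&gt;
  %do i=1 %to 30;&lt;BR /&gt;
    data _T&amp;i/view=_T&amp;i;   /* the view now passes through all variables */&lt;BR /&gt;
      set T&amp;i;&lt;BR /&gt;
    run;&lt;BR /&gt;
  %end;&lt;BR /&gt;
  data TESTA;   /* keep= is applied only when the views are merged */&lt;BR /&gt;
    merge %do i=1 %to 30;&lt;BR /&gt;
      _T&amp;i(keep=X1-X9 I&lt;BR /&gt;
           rename=(X1=X1_&amp;i X2=X2_&amp;i X3=X3_&amp;i X4=X4_&amp;i X5=X5_&amp;i X6=X6_&amp;i X7=X7_&amp;i X8=X8_&amp;i X9=X9_&amp;i))&lt;BR /&gt;
    %end;&lt;BR /&gt;
    ;by I;&lt;BR /&gt;
  run;&lt;BR /&gt;
%mend; %t_keep&lt;BR /&gt;
[/pre]&lt;BR /&gt;
&lt;BR /&gt;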
So, I think that your 2nd explanation is right.</description>
      <pubDate>Tue, 22 Sep 2009 08:39:25 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/views-are-faster-for-parallel-processing-why/m-p/67493#M14610</guid>
      <dc:creator>Oleg_L</dc:creator>
      <dc:date>2009-09-22T08:39:25Z</dc:date>
    </item>
    <item>
      <title>Re: views are faster for parallel processing. why?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/views-are-faster-for-parallel-processing-why/m-p/67494#M14611</link>
      <description>Hello Chris.&lt;BR /&gt;
&lt;BR /&gt;
Somehow I missed this very interesting post.&lt;BR /&gt;
&lt;BR /&gt;
Will submit your code on my AIX box...&lt;BR /&gt;
&lt;BR /&gt;
Cheers from Portugal.&lt;BR /&gt;
&lt;BR /&gt;
Daniel Santos @ &lt;A href="http://www.cgd.pt" target="_blank"&gt;www.cgd.pt&lt;/A&gt;</description>
      <pubDate>Tue, 22 Sep 2009 15:09:27 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/views-are-faster-for-parallel-processing-why/m-p/67494#M14611</guid>
      <dc:creator>DanielSantos</dc:creator>
      <dc:date>2009-09-22T15:09:27Z</dc:date>
    </item>
    <item>
      <title>Re: views are faster for parallel processing. why?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/views-are-faster-for-parallel-processing-why/m-p/67495#M14612</link>
      <description>Done.&lt;BR /&gt;
&lt;BR /&gt;
Although the merge with views executed in slightly less time, performance is nearly identical...&lt;BR /&gt;
&lt;BR /&gt;
Merge with views:&lt;BR /&gt;
&lt;BR /&gt;
NOTE: DATA statement used (Total process time):&lt;BR /&gt;
      real time           16:24.78&lt;BR /&gt;
      user cpu time       41.00 seconds&lt;BR /&gt;
      system cpu time     32.28 seconds&lt;BR /&gt;
      Memory                            22663k&lt;BR /&gt;
      Page Faults                       185862&lt;BR /&gt;
      Page Reclaims                     15291&lt;BR /&gt;
      Page Swaps                        0&lt;BR /&gt;
      Voluntary Context Switches        9793&lt;BR /&gt;
      Involuntary Context Switches      6277&lt;BR /&gt;
      Block Input Operations            0&lt;BR /&gt;
      Block Output Operations           0&lt;BR /&gt;
&lt;BR /&gt;
Merge with tables:&lt;BR /&gt;
&lt;BR /&gt;
NOTE: DATA statement used (Total process time):&lt;BR /&gt;
      real time           16:29.87&lt;BR /&gt;
      user cpu time       37.82 seconds&lt;BR /&gt;
      system cpu time     29.61 seconds&lt;BR /&gt;
      Memory                            21024k&lt;BR /&gt;
      Page Faults                       185651&lt;BR /&gt;
      Page Reclaims                     8430&lt;BR /&gt;
      Page Swaps                        0&lt;BR /&gt;
      Voluntary Context Switches        9780&lt;BR /&gt;
      Involuntary Context Switches      8154&lt;BR /&gt;
      Block Input Operations            0&lt;BR /&gt;
      Block Output Operations           0&lt;BR /&gt;
&lt;BR /&gt;
- Page faults (I/O) are about the same, but page reclaims (done in memory) are much higher (almost double) for the merge with views, which may explain the 5-second difference.&lt;BR /&gt;
- Memory was about the same with both techniques.&lt;BR /&gt;
- CPU usage was a little lower for the merge with tables, which is reasonable, since views should carry some processing overhead to map/resolve the logical definition to the physical data.&lt;BR /&gt;
&lt;BR /&gt;
But your theory may actually be right (views read by blocks vs. tables read by rows), and the difference in page reclaims may point to that: data read into memory in larger blocks (and then retrieved from there) vs. data read into memory in smaller blocks, with many more I/O operations.&lt;BR /&gt;
Anyway, as always with performance, it comes down to the system. Going by the above theory, on our system the merge with views performed fewer I/O operations, yet the total processing time remained pretty much the same.&lt;BR /&gt;
&lt;BR /&gt;
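The extended statistics in the logs above (memory, page faults, context switches) are what the FULLSTIMER system option writes to the log on UNIX hosts. A minimal sketch for reproducing them, assuming Chris's macro %t has already been compiled in the session:&lt;BR /&gt;
[pre]&lt;BR /&gt;
options fullstimer;     /* log extended resource statistics for each step */&lt;BR /&gt;
%t&lt;BR /&gt;
options nofullstimer;   /* back to the short real time / cpu time note */&lt;BR /&gt;
[/pre]&lt;BR /&gt;
&lt;BR /&gt;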
Cheers from Portugal.&lt;BR /&gt;
&lt;BR /&gt;
Daniel Santos @ &lt;A href="http://www.cgd.pt" target="_blank"&gt;www.cgd.pt&lt;/A&gt;.</description>
      <pubDate>Tue, 22 Sep 2009 15:33:02 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/views-are-faster-for-parallel-processing-why/m-p/67495#M14612</guid>
      <dc:creator>DanielSantos</dc:creator>
      <dc:date>2009-09-22T15:33:02Z</dc:date>
    </item>
    <item>
      <title>Re: views are faster for parallel processing. why?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/views-are-faster-for-parallel-processing-why/m-p/67496#M14613</link>
      <description>Oh, by the way: I had never stopped to think about this subject, so thanks for bringing it up!&lt;BR /&gt;
&lt;BR /&gt;
Cheers from Portugal.&lt;BR /&gt;
&lt;BR /&gt;
Daniel Santos @ &lt;A href="http://www.cgd.pt" target="_blank"&gt;www.cgd.pt&lt;/A&gt;</description>
      <pubDate>Tue, 22 Sep 2009 16:32:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/views-are-faster-for-parallel-processing-why/m-p/67496#M14613</guid>
      <dc:creator>DanielSantos</dc:creator>
      <dc:date>2009-09-22T16:32:12Z</dc:date>
    </item>
    <item>
      <title>Re: views are faster for parallel processing. why?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/views-are-faster-for-parallel-processing-why/m-p/67497#M14614</link>
      <description>Thanks for your input guys, much appreciated.&lt;BR /&gt;
Nice machine you have, Daniel!</description>
      <pubDate>Thu, 24 Sep 2009 20:52:44 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/views-are-faster-for-parallel-processing-why/m-p/67497#M14614</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2009-09-24T20:52:44Z</dc:date>
    </item>
    <item>
      <title>Re: views are faster for parallel processing. why?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/views-are-faster-for-parallel-processing-why/m-p/67498#M14615</link>
      <description>Solaris 10, quite a strong production box. I didn't see the effect. Tables are faster.&lt;BR /&gt;
&lt;BR /&gt;
NOTE: The data set WORK.TESTA has 100000 observations and 271 variables.&lt;BR /&gt;
NOTE: DATA statement used (Total process time):&lt;BR /&gt;
      real time           49.00 seconds&lt;BR /&gt;
      user cpu time       17.17 seconds&lt;BR /&gt;
      system cpu time     29.58 seconds&lt;BR /&gt;
      Memory                            12129k&lt;BR /&gt;
      Page Faults                       7&lt;BR /&gt;
      Page Reclaims                     0&lt;BR /&gt;
      Page Swaps                        0&lt;BR /&gt;
      Voluntary Context Switches        814&lt;BR /&gt;
      Involuntary Context Switches      713&lt;BR /&gt;
      Block Input Operations            1&lt;BR /&gt;
      Block Output Operations           1&lt;BR /&gt;
&lt;BR /&gt;
NOTE: The data set WORK.TESTB has 100000 observations and 271 variables.&lt;BR /&gt;
NOTE: DATA statement used (Total process time):&lt;BR /&gt;
      real time           52.00 seconds&lt;BR /&gt;
      user cpu time       16.01 seconds&lt;BR /&gt;
      system cpu time     35.29 seconds&lt;BR /&gt;
      Memory                            10482k&lt;BR /&gt;
      Page Faults                       0&lt;BR /&gt;
      Page Reclaims                     0&lt;BR /&gt;
      Page Swaps                        0&lt;BR /&gt;
      Voluntary Context Switches        877&lt;BR /&gt;
      Involuntary Context Switches      849&lt;BR /&gt;
      Block Input Operations            0&lt;BR /&gt;
      Block Output Operations           1</description>
      <pubDate>Wed, 30 Sep 2009 11:40:56 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/views-are-faster-for-parallel-processing-why/m-p/67498#M14615</guid>
      <dc:creator>Opa4ki</dc:creator>
      <dc:date>2009-09-30T11:40:56Z</dc:date>
    </item>
    <item>
      <title>Re: views are faster for parallel processing. why?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/views-are-faster-for-parallel-processing-why/m-p/67499#M14616</link>
      <description>In your case, I would say that it is a matter of "too much" power. &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;BR /&gt;
&lt;BR /&gt;
Maybe you could scale up Chris's code to suit your platform.&lt;BR /&gt;
&lt;BR /&gt;
And could you please share your hardware setup? Nice system you have there.&lt;BR /&gt;
&lt;BR /&gt;
Cheers from Portugal.&lt;BR /&gt;
&lt;BR /&gt;
Daniel Santos @ &lt;A href="http://www.cgd.pt" target="_blank"&gt;www.cgd.pt&lt;/A&gt;</description>
      <pubDate>Thu, 01 Oct 2009 07:07:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/views-are-faster-for-parallel-processing-why/m-p/67499#M14616</guid>
      <dc:creator>DanielSantos</dc:creator>
      <dc:date>2009-10-01T07:07:39Z</dc:date>
    </item>
    <item>
      <title>Re: views are faster for parallel processing. why?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/views-are-faster-for-parallel-processing-why/m-p/67500#M14617</link>
      <description>uname -a&lt;BR /&gt;
SunOS two 5.10 Generic_138888-08 sun4u sparc SUNW,Sun-Fire-15000&lt;BR /&gt;
&lt;BR /&gt;
20 sparcv9 processors, which appear as 40 virtual processors at 1800 MHz&lt;BR /&gt;
vmstat shows ~120G free swap and 110G free physical memory.&lt;BR /&gt;
&lt;BR /&gt;
DWH ETL in telecom&lt;BR /&gt;
&lt;BR /&gt;
About the tests: I don't think I'm going to spend time on anything meaningful along these lines in the near future.</description>
      <pubDate>Thu, 01 Oct 2009 10:05:57 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/views-are-faster-for-parallel-processing-why/m-p/67500#M14617</guid>
      <dc:creator>Opa4ki</dc:creator>
      <dc:date>2009-10-01T10:05:57Z</dc:date>
    </item>
    <item>
      <title>Re: views are faster for parallel processing. why?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/views-are-faster-for-parallel-processing-why/m-p/67501#M14618</link>
      <description>It looks like this is Windows platform behaviour then, probably due to different SAS buffer defaults for views and tables.&lt;BR /&gt;
&lt;BR /&gt;
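A quick way to look at the defaults in question, sketched with the table T1 from the test code: PROC OPTIONS prints the session values of BUFNO and BUFSIZE, and PROC CONTENTS reports the page size actually used by a table. This only covers tables; it does not reveal what a view uses internally.&lt;BR /&gt;
[pre]&lt;BR /&gt;
proc options option=bufno;   run;   /* number of buffers per data set */&lt;BR /&gt;
proc options option=bufsize; run;   /* buffer (page) size for new data sets */&lt;BR /&gt;
proc contents data=T1;       run;   /* see Data Set Page Size in the output */&lt;BR /&gt;
[/pre]&lt;BR /&gt;
&lt;BR /&gt;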
It is well worth knowing about this in any case: one process was reduced from 2 hours to 10 min. &lt;BR /&gt;
&lt;BR /&gt;
I am very curious as to what changes inside the SAS engine for such dramatic improvements to take place.</description>
      <pubDate>Sun, 04 Oct 2009 06:01:42 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/views-are-faster-for-parallel-processing-why/m-p/67501#M14618</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2009-10-04T06:01:42Z</dc:date>
    </item>
  </channel>
</rss>

