<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic How to rightly use proc ds2 multiple threads to speed up data step? in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/How-to-rightly-use-proc-ds2-multiple-threads-to-speed-up-data/m-p/906643#M357993</link>
    <description>&lt;P&gt;I am learning PROC DS2. Some materials say there is a threads= option that lets SAS run a program on multiple threads, so it can run faster. I wrote the following program:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data test1;
  set sashelp.class;
  do i=1 to 3e6;
    output;
  end;
run;

*Normal data step;
data test2;
  set test1;
  sum_age+age;
run;

proc delete data=test2;
run;

*Data step in DS2, without multiple threads;
proc ds2;
  data test3/overwrite=yes;
    method run();
      dcl int sum_age;
      set test1;
      sum_age+age;
    end;
  enddata;
  run;
quit;

proc delete data=test3;
run;

*Data step in DS2, with multiple threads;
proc ds2;
  thread t/overwrite=yes;
    method run();
      dcl int sum_age;
      set test1;
      sum_age+age;
    end;
  endthread;
  run;

  data test4/overwrite=yes;
    dcl thread t t_instance;
    method run();
      set from t_instance threads=8;
    end;
  enddata;
  run;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;However, I can't make sense of the run times:&lt;/P&gt;
&lt;PRE&gt;Normal data step
real time: 2.97 seconds
cpu time: 2.87 seconds

Data step in DS2, without multiple threads
real time: 6.83 seconds
cpu time: 8.06 seconds

Data step in DS2, with multiple threads
real time: 4.47 seconds
cpu time: 13.26 seconds&lt;/PRE&gt;
&lt;P&gt;I run PROC DELETE before each test to keep the WORK library clean. My questions are:&lt;/P&gt;
&lt;P&gt;1. Why does the DS2 data step without multiple threads take so much longer than the normal DATA step?&lt;/P&gt;
&lt;P&gt;2. Why does the DS2 data step with multiple threads still take longer than the normal DATA step? Even with threads=16, there is no significant difference in real time.&lt;/P&gt;</description>
    <pubDate>Thu, 07 Dec 2023 02:33:12 GMT</pubDate>
    <dc:creator>whymath</dc:creator>
    <dc:date>2023-12-07T02:33:12Z</dc:date>
    <item>
      <title>How to rightly use proc ds2 multiple threads to speed up data step?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-rightly-use-proc-ds2-multiple-threads-to-speed-up-data/m-p/906643#M357993</link>
      <description>&lt;P&gt;I am learning PROC DS2. Some materials say there is a threads= option that lets SAS run a program on multiple threads, so it can run faster. I wrote the following program:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data test1;
  set sashelp.class;
  do i=1 to 3e6;
    output;
  end;
run;

*Normal data step;
data test2;
  set test1;
  sum_age+age;
run;

proc delete data=test2;
run;

*Data step in DS2, without multiple threads;
proc ds2;
  data test3/overwrite=yes;
    method run();
      dcl int sum_age;
      set test1;
      sum_age+age;
    end;
  enddata;
  run;
quit;

proc delete data=test3;
run;

*Data step in DS2, with multiple threads;
proc ds2;
  thread t/overwrite=yes;
    method run();
      dcl int sum_age;
      set test1;
      sum_age+age;
    end;
  endthread;
  run;

  data test4/overwrite=yes;
    dcl thread t t_instance;
    method run();
      set from t_instance threads=8;
    end;
  enddata;
  run;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;However, I can't make sense of the run times:&lt;/P&gt;
&lt;PRE&gt;Normal data step
real time: 2.97 seconds
cpu time: 2.87 seconds

Data step in DS2, without multiple threads
real time: 6.83 seconds
cpu time: 8.06 seconds

Data step in DS2, with multiple threads
real time: 4.47 seconds
cpu time: 13.26 seconds&lt;/PRE&gt;
&lt;P&gt;I run PROC DELETE before each test to keep the WORK library clean. My questions are:&lt;/P&gt;
&lt;P&gt;1. Why does the DS2 data step without multiple threads take so much longer than the normal DATA step?&lt;/P&gt;
&lt;P&gt;2. Why does the DS2 data step with multiple threads still take longer than the normal DATA step? Even with threads=16, there is no significant difference in real time.&lt;/P&gt;</description>
      <pubDate>Thu, 07 Dec 2023 02:33:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-rightly-use-proc-ds2-multiple-threads-to-speed-up-data/m-p/906643#M357993</guid>
      <dc:creator>whymath</dc:creator>
      <dc:date>2023-12-07T02:33:12Z</dc:date>
    </item>
    <item>
      <title>Re: How to rightly use proc ds2 multiple threads to speed up data step?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-rightly-use-proc-ds2-multiple-threads-to-speed-up-data/m-p/906688#M358007</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/270406"&gt;@whymath&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Check out this paper "&lt;A title="User-Defined Multithreading with the SAS® DS2 Procedure: Performance Testing DS2 Against Functionally Equivalent DATA Steps" href="https://www.lexjansen.com/pharmasug/2019/AD/PharmaSUG-2019-AD-228.pdf" target="_blank" rel="noopener"&gt;User-Defined Multithreading with the SAS® DS2 Procedure: Performance Testing DS2 Against Functionally Equivalent DATA Steps&lt;/A&gt;"&lt;/P&gt;
&lt;P&gt;by Troy Martin Hughes.&lt;/P&gt;
&lt;P&gt;He does an excellent job of explaining what you are observing.&lt;/P&gt;
&lt;P&gt;Hope this helps,&lt;/P&gt;
&lt;P&gt;Ahmed&lt;/P&gt;</description>
      <pubDate>Thu, 07 Dec 2023 11:10:21 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-rightly-use-proc-ds2-multiple-threads-to-speed-up-data/m-p/906688#M358007</guid>
      <dc:creator>AhmedAl_Attar</dc:creator>
      <dc:date>2023-12-07T11:10:21Z</dc:date>
    </item>
  </channel>
</rss>

