Threads in DS2, no time gain

Super Contributor
Posts: 298

Dear Experts,

I wonder why I can't get a time gain from threaded processing in DS2. Below I create a dataset with a group variable, then count the observations in each group using threaded BY processing in DS2. It turns out to take far longer than doing the same with a normal DATA step. I had expected threaded processing to give far better performance, so I wonder why this does not happen.

By the way, this example is just to illustrate the problem. If "counting observations" were the real problem, there would be better methods to do that.

*the test dataset:;

data test;
  do group=1 to 10;
    do i=1 to 1000000;
      output;
    end;
  end;
run;

*Count observations with DS2:;

proc ds2 bypartition=no stimer;
  thread read/overwrite=yes;
    declare double count;
    method init();
      count=0;
    end;
    method run();
      set test;
      by group i;
      if first.group then count=0;
      count+1;
      if last.group then output;
    end;
  endthread;

  data abc/overwrite=yes;
    keep group count;
    declare thread read instance;
    method run();
      set from instance threads=4;
      output;
    end;
  run;
quit;

NOTE: DS2 query used (Total process time):
      real time           11.23 seconds
      cpu time            26.59 seconds

*In comparison, an ordinary datastep:;

data abc;
  set test;
  by group i;
  if first.group then count=0;
  count+1;
run;

NOTE: There were 10000000 observations read from the data set WORK.TEST.
NOTE: The data set WORK.ABC has 10000000 observations and 4 variables.
NOTE: DATA statement used (Total process time):
      real time           6.89 seconds
      cpu time            6.59 seconds

It can vary a bit from one run to another, but basically the same result came out each time. Also, changing bypartition to "yes" does not make any big difference. And yes, I do have multiple processors on my server.

Super User
Posts: 7,866

Re: Threads in DS2, no time gain

Posted in reply to JacobSimonsen

Parallel processing of I/O-intensive tasks only makes sense if the I/O can be split onto physically separate devices.

As long as the data set in question is on one device, the threads will cause colliding requests on that device and ultimately slow the process down as compared to one single, often sequential scan through the data set.

That's why the SPDE engine works best with groups of disks matched to the number of processors.
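As a rough illustration of the point about separate devices, an SPDE library can be told to spread its data partitions over several paths. The following is only a sketch: the libref `spdelib` and every directory name are hypothetical, and it assumes each data path really sits on its own physical device (otherwise nothing is gained).

```sas
/* Hypothetical paths: each DATAPATH= entry is assumed to be  */
/* a directory on a separate physical disk.                   */
libname spdelib spde '/sasdata/spde/meta'
  datapath=('/disk1/spde' '/disk2/spde' '/disk3/spde' '/disk4/spde');

/* Same test data as in the question, written through the     */
/* SPDE engine so its partitions are spread over the paths.   */
data spdelib.test;
  do group=1 to 10;
    do i=1 to 1000000;
      output;
    end;
  end;
run;
```

Whether this helps the DS2 example in the question depends on the storage behind those paths, so it would need to be timed on the actual server.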

Trusted Advisor
Posts: 3,215

Re: Threads in DS2, no time gain

Posted in reply to JacobSimonsen

It would become more interesting if you loaded the dataset into memory using the SASFILE approach.

That would eliminate the I/O constraint, which is the slowest part of most processing.

You need to have a lot of memory but that should not be an issue these days.
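A minimal sketch of the SASFILE approach, assuming the test dataset is WORK.TEST as in the question and that the session has enough memory to hold it:

```sas
/* Load WORK.TEST into memory; it stays loaded until closed. */
sasfile work.test load;

/* ... run the PROC DS2 step or the DATA step that reads     */
/* WORK.TEST here, and compare the timings in the log ...    */

/* Release the memory buffers when finished. */
sasfile work.test close;
```

Without the CLOSE, the dataset stays loaded for the rest of the session, so it is worth closing it explicitly once the timing runs are done.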

The next issue is the overhead of starting and maintaining threads. When that overhead is high compared to the processing itself, you have another reason why overall speed will not improve.

---->-- ja karman --<-----
Super Contributor
Posts: 298

Re: Threads in DS2, no time gain

I have tried that also, but it doesn't help. With "SASFILE test load" before proc ds2 I get almost the same result:

NOTE: DS2 query used (Total process time):
      real time           7.79 seconds
      cpu time            24.11 seconds

Trusted Advisor
Posts: 3,215

Re: Threads in DS2, no time gain

Posted in reply to JacobSimonsen

You are probably hitting the overhead of starting and maintaining all the processes.

Adding a more complicated function instead of counting should prove that.

It is another dimension in causing load.

Your result now shows that the total response time is almost equal, but threading adds a lot of overhead: the job finishes in about 8 seconds of real time while about 25 seconds of CPU time is used.

---->-- ja karman --<-----
Trusted Advisor
Posts: 1,301

Re: Threads in DS2, no time gain

Posted in reply to JacobSimonsen

Your routine is not computationally complex enough to benefit from threading; you are really only adding overhead, since the I/O is still in a single thread. Instead of a simple count, you could try this example from

options cpucount=actual;
proc options option=cpucount;run;
libname base '/u/jaseco/tmp/base';

data base.jmaster;
  do j = 1 to 10e6;
    output;
  end;
run;

proc ds2;
  thread r /overwrite=yes;
    dcl double count k x;
    method run();
      set base.jmaster;
      count+1;
      do k=1 to 80;/* Add some gratuitous computation! */
        x = k/count + k/count + k/count;
      end;
    end;
    method term();
      OUTPUT;
    end;
  endthread;
  run;
quit;

proc ds2;
  data j1(overwrite=yes);
    dcl thread r r_instance;
    dcl double count total;
    method run();
      set from r_instance threads=1;
      total+count;
    end;
  enddata;
  run;
quit;

proc ds2;
  data j2(overwrite=yes);
    dcl thread r r_instance;
    dcl double count total;
    method run();
      set from r_instance threads=2;
      total+count;
    end;
  enddata;
  run;
quit;

proc ds2;
  data j4(overwrite=yes);
    dcl thread r r_instance;
    dcl double count total;
    method run();
      set from r_instance threads=4;
      total+count;
    end;
  enddata;
  run;
quit;

proc ds2;
  data j8(overwrite=yes);
    dcl thread r r_instance;
    dcl double count total;
    method run();
      set from r_instance threads=8;
      total+count;
    end;
  enddata;
  run;
quit;

proc ds2;
  data j16(overwrite=yes);
    dcl thread r r_instance;
    dcl double count total;
    method run();
      set from r_instance threads=16;
      total+count;
    end;
  enddata;
  run;
quit;

/****************************/
/* And read it in DATA step */
/****************************/
data jold;
  set base.jmaster end=finish;
  count+1;
  do k=1 to 80;/* Add some gratuitous computation! */
    x = k/count + k/count + k/count;
  end;
  if finish then output;
run;

Super Contributor
Posts: 298

Re: Threads in DS2, no time gain

You are right: when the computational task is relatively larger than the I/O task, the gain from threaded processing can be huge even though the I/O is not threaded.

I tried the code you suggested, and I observe that the computation (real) time decreases a lot as the number of threads is increased.

When 8 threads are used:

NOTE: PROCEDURE DS2 used (Total process time):
      real time           2.40 seconds
      cpu time            17.50 seconds

When the ordinary datastep is used:

NOTE: There were 10000000 observations read from the data set BASE.JMASTER.
NOTE: The data set WORK.JOLD has 1 observations and 4 variables.
NOTE: DATA statement used (Total process time):
      real time           42.09 seconds
      cpu time            42.13 seconds
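
To put those two logs side by side, the ratios can be worked out in a small DATA _NULL_ step. This is only a sketch using the timings quoted above; the variable names are made up for illustration.

```sas
data _null_;
  /* Timings copied from the two logs above */
  ds2_real = 2.40;   ds2_cpu = 17.50;   /* PROC DS2, 8 threads */
  ds_real  = 42.09;  ds_cpu  = 42.13;   /* ordinary DATA step  */

  speedup   = ds_real / ds2_real;  /* wall-clock speedup                */
  busy_cpus = ds2_cpu / ds2_real;  /* average number of CPUs kept busy  */
  cpu_ratio = ds_cpu  / ds2_cpu;   /* total CPU work, DATA step vs DS2  */

  put speedup= 6.2 busy_cpus= 6.2 cpu_ratio= 6.2;
run;
```

The wall-clock speedup works out to roughly 17.5, which is more than the 8 threads alone could explain. One possible reading is that DS2 also needed less total CPU time for this loop than the DATA step did (17.50 versus 42.13 seconds), so not all of the gain comes from threading.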
