<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How do I distribute tasks in the IML action in SAS/IML Software and Matrix Computations</title>
    <link>https://communities.sas.com/t5/SAS-IML-Software-and-Matrix/How-do-I-distribute-tasks-in-the-IML-action/m-p/836278#M5871</link>
    <description>&lt;P&gt;A few thoughts:&lt;/P&gt;
&lt;P&gt;1. I don't know the details of your program, but you can't "combine" ParTasks and MapReduce. Use one or the other. You should use ParTasks when the tasks on each thread are different. Use MapReduce when the tasks on each thread are similar.&lt;/P&gt;
&lt;P&gt;2. While it is possible to partition data so that some observations are on one node and some are on others, I've never done it.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;3. Threads on the same node share the same resources for the node. So in SMP mode (controller only), all your threads can process the same data. Perhaps the threads can process different customers?&amp;nbsp;The first thread could keep and process only A-F, the second thread G-M, and so on.&amp;nbsp; If so, you don't need to partition the data across nodes.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;My suggestion: Try to solve this problem on one node by using multiple threads. To get started, create a CAS table with about 100 fake customers and see if you can write a&amp;nbsp; MapReduce task that uses 4 threads in which each thread processes only a subset of the data, such as A-F, G-M, N-S, and T-Z.&amp;nbsp; As a first program, see if you can return the number of observations that each thread processes.&lt;/P&gt;</description>
    <pubDate>Sat, 01 Oct 2022 10:44:05 GMT</pubDate>
    <dc:creator>Rick_SAS</dc:creator>
    <dc:date>2022-10-01T10:44:05Z</dc:date>
    <item>
      <title>How do I distribute tasks in the IML action</title>
      <link>https://communities.sas.com/t5/SAS-IML-Software-and-Matrix/How-do-I-distribute-tasks-in-the-IML-action/m-p/836263#M5864</link>
      <description>I have a large dataset and a working program that runs in the iml action.&lt;BR /&gt;However, it takes a long time to run.&lt;BR /&gt;Besides changes I could make to run it more efficiently, I wonder how I can use several nodes and threads for this computation.&lt;BR /&gt;&lt;BR /&gt;&lt;A href="https://documentation.sas.com/doc/en/pgmsascdc/v_006/casactiml/casactiml_iml_examples16.htm" target="_blank"&gt;https://documentation.sas.com/doc/en/pgmsascdc/v_006/casactiml/casactiml_iml_examples16.htm&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;Reading the documentation and its examples, I think that ParTasks together with the MapReduce function should help me out.&lt;BR /&gt;Am I on the right track?&lt;BR /&gt;For the logic to work, I need control over how the data is distributed.&lt;BR /&gt;Imagine 8 million rows with 1 million different customers. I need all entries for one customer to be sent to the same thread; otherwise the logic breaks. If this criterion is met, the MapReduce step should be easy to implement.&lt;BR /&gt;&lt;BR /&gt;So I must send customers whose surnames start with A, B, or C to thread 1, D-F to thread 2, and so on.&lt;BR /&gt;&lt;BR /&gt;Is there any example out there that combines ParTasks and MapReduce?&lt;BR /&gt;&lt;BR /&gt;Thanks</description>
      <pubDate>Sat, 01 Oct 2022 08:46:02 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-IML-Software-and-Matrix/How-do-I-distribute-tasks-in-the-IML-action/m-p/836263#M5864</guid>
      <dc:creator>acordes</dc:creator>
      <dc:date>2022-10-01T08:46:02Z</dc:date>
    </item>
    <item>
      <title>Re: How do I distribute tasks in the IML action</title>
      <link>https://communities.sas.com/t5/SAS-IML-Software-and-Matrix/How-do-I-distribute-tasks-in-the-IML-action/m-p/836276#M5870</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I'm also very curious about the answer to your question.&lt;/P&gt;
&lt;P&gt;Especially this part: "&lt;SPAN&gt;need to have control over how the data will be distributed.&lt;/SPAN&gt;"&lt;/P&gt;
&lt;P&gt;And probably my colleague&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/267850"&gt;@DaanBijkerk&lt;/a&gt;&amp;nbsp;is interested as well.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I am sure&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13684"&gt;@Rick_SAS&lt;/a&gt;&amp;nbsp;can help you out!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I can only say this:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;A href="https://go.documentation.sas.com/doc/en/pgmsascdc/v_028/casactiml/cas-iml-iml.htm#SAS.cas-iml-iml-nthreads" target="_blank" rel="noopener"&gt;The THREADS= option&lt;/A&gt; specifies the &lt;STRONG&gt;maximum&lt;/STRONG&gt; number of threads that might be used. Not every program will use all threads.&lt;/LI&gt;
&lt;LI&gt;The MAPREDUCE function does distribute computations in parallel across threads (and across nodes, if you are running on a grid of machines).&lt;/LI&gt;
&lt;LI&gt;If your code does not use any functions that are multithreaded or that are distributed, then the program will run in a single thread (also in PROC CAS using the IML action set).&lt;/LI&gt;
&lt;LI&gt;Regarding what runs in parallel, see &lt;A href="https://go.documentation.sas.com/doc/en/pgmsascdc/v_028/casactiml/casactiml_iml_details01.htm" target="_blank" rel="noopener"&gt;SAS Help Center: The iml Action&lt;/A&gt;&lt;BR /&gt;which contains the quote, “When you run a traditional PROC IML program in the&amp;nbsp;iml&amp;nbsp;action, the program runs in a single thread on the controller node. For a list of SAS/IML functions that are not supported by the&amp;nbsp;iml&amp;nbsp;action, see&amp;nbsp;&lt;A href="https://go.documentation.sas.com/doc/en/pgmsascdc/v_028/casactiml/casactiml_iml_details02.htm" target="_blank" rel="noopener"&gt;Differences between the IML Procedure and the iml Action&lt;/A&gt;.”&lt;/LI&gt;
&lt;/UL&gt;
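&lt;P&gt;A minimal sketch of the first and third points, assuming the usual PROC CAS pattern for calling the iml action (the program name SerialPgm and its contents are made up for illustration): the call below allows up to 4 threads, but the program uses no multithreaded or distributed functions, so it should still run in a single thread.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc cas;
   loadactionset "iml";
   source SerialPgm;
      /* ordinary IML statements: nothing here is multithreaded or distributed */
      x = T(1:5);
      s = sum(x##2);
      print s;
   endsource;
   /* nthreads= sets an upper bound; it does not force 4 threads to be used */
   iml / code=SerialPgm nthreads=4;
run;
quit;
&lt;/CODE&gt;&lt;/PRE&gt;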
&lt;P&gt;Best,&lt;/P&gt;
&lt;P&gt;Koen&lt;/P&gt;</description>
      <pubDate>Sat, 01 Oct 2022 10:32:54 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-IML-Software-and-Matrix/How-do-I-distribute-tasks-in-the-IML-action/m-p/836276#M5870</guid>
      <dc:creator>sbxkoenk</dc:creator>
      <dc:date>2022-10-01T10:32:54Z</dc:date>
    </item>
    <item>
      <title>Re: How do I distribute tasks in the IML action</title>
      <link>https://communities.sas.com/t5/SAS-IML-Software-and-Matrix/How-do-I-distribute-tasks-in-the-IML-action/m-p/836278#M5871</link>
      <description>&lt;P&gt;A few thoughts:&lt;/P&gt;
&lt;P&gt;1. I don't know the details of your program, but you can't "combine" ParTasks and MapReduce. Use one or the other. You should use ParTasks when the tasks on each thread are different. Use MapReduce when the tasks on each thread are similar.&lt;/P&gt;
&lt;P&gt;2. While it is possible to partition data so that some observations are on one node and some are on others, I've never done it.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;3. Threads on the same node share the same resources for the node. So in SMP mode (controller only), all your threads can process the same data. Perhaps the threads can process different customers?&amp;nbsp;The first thread could keep and process only A-F, the second thread G-M, and so on.&amp;nbsp; If so, you don't need to partition the data across nodes.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;My suggestion: Try to solve this problem on one node by using multiple threads. To get started, create a CAS table with about 100 fake customers and see if you can write a&amp;nbsp; MapReduce task that uses 4 threads in which each thread processes only a subset of the data, such as A-F, G-M, N-S, and T-Z.&amp;nbsp; As a first program, see if you can return the number of observations that each thread processes.&lt;/P&gt;</description>
      <pubDate>Sat, 01 Oct 2022 10:44:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-IML-Software-and-Matrix/How-do-I-distribute-tasks-in-the-IML-action/m-p/836278#M5871</guid>
      <dc:creator>Rick_SAS</dc:creator>
      <dc:date>2022-10-01T10:44:05Z</dc:date>
    </item>
    <item>
      <title>Re: How do I distribute tasks in the IML action</title>
      <link>https://communities.sas.com/t5/SAS-IML-Software-and-Matrix/How-do-I-distribute-tasks-in-the-IML-action/m-p/836279#M5872</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;
&lt;P&gt;(This is in addition to my answer right above, although I think&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13684"&gt;@Rick_SAS&lt;/a&gt;&amp;nbsp;just&amp;nbsp;"cycled in between" &lt;span class="lia-unicode-emoji" title=":winking_face:"&gt;😉&lt;/span&gt;)&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt;I can't resist pointing out other options for parallelization to you (and to all other interested readers).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The first two articles in the list below explain MP Connect.&lt;/P&gt;
&lt;P&gt;I use MP Connect all the time (also in SAS Viya 4).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The last two show that you can even parallelize (further) within PROC CAS.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Running SAS programs in parallel using SAS/CONNECT® &lt;BR /&gt;By Leonid Batkhan on SAS Users January 13, 2021&lt;BR /&gt;&lt;A href="https://blogs.sas.com/content/sgf/2021/01/13/running-sas-programs-in-parallel-using-sas-connect/" target="_blank"&gt;https://blogs.sas.com/content/sgf/2021/01/13/running-sas-programs-in-parallel-using-sas-connect/&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Base SAS + SAS/CONNECT - A simple method to generate load on any number of licensed cores&lt;BR /&gt;Posted 04-08-2021 by SimonWilliams&lt;BR /&gt;&lt;A href="https://communities.sas.com/t5/SAS-Communities-Library/Base-SAS-SAS-CONNECT-A-simple-method-to-generate-load-on-any/ta-p/732174" target="_blank"&gt;https://communities.sas.com/t5/SAS-Communities-Library/Base-SAS-SAS-CONNECT-A-simple-method-to-generate-load-on-any/ta-p/732174&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Using SYSTASK and SAS macro loops for massively parallel processing&lt;BR /&gt;By Leonid Batkhan on SAS Users June 14, 2021&lt;BR /&gt;&lt;A href="https://blogs.sas.com/content/sgf/2021/06/14/using-systask-and-sas-macro-loops-for-massively-parallel-processing/" target="_blank"&gt;https://blogs.sas.com/content/sgf/2021/06/14/using-systask-and-sas-macro-loops-for-massively-parallel-processing/&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Tips for parallel processing in CASL&lt;BR /&gt;By Ricky Tharrington on July 6, 2021&lt;BR /&gt;&lt;A href="https://blogs.sas.com/content/subconsciousmusings/2021/07/06/tips-for-parallel-processing-in-casl/" target="_blank"&gt;https://blogs.sas.com/content/subconsciousmusings/2021/07/06/tips-for-parallel-processing-in-casl/&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Parallel Processing in SAS Viya&lt;BR /&gt;By Ricky Tharrington on May 25, 2021&lt;BR /&gt;&lt;A href="https://blogs.sas.com/content/subconsciousmusings/2021/05/25/parallel-processing-in-sas-viya/" target="_blank"&gt;https://blogs.sas.com/content/subconsciousmusings/2021/05/25/parallel-processing-in-sas-viya/&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Greetings,&lt;BR /&gt;Koen&lt;/P&gt;</description>
      <pubDate>Sat, 01 Oct 2022 10:48:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-IML-Software-and-Matrix/How-do-I-distribute-tasks-in-the-IML-action/m-p/836279#M5872</guid>
      <dc:creator>sbxkoenk</dc:creator>
      <dc:date>2022-10-01T10:48:53Z</dc:date>
    </item>
    <item>
      <title>Re: How do I distribute tasks in the IML action</title>
      <link>https://communities.sas.com/t5/SAS-IML-Software-and-Matrix/How-do-I-distribute-tasks-in-the-IML-action/m-p/842331#M5885</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13684"&gt;@Rick_SAS&lt;/a&gt;&amp;nbsp;I've resolved it.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;What used to take a long time (if it finished at all) now takes only 2 minutes. I use 16 threads and it works like a charm.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Here is my code, without comments.&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc cas;
loadactionset "iml";
source DefineMods;
/* SAS/IML function that each ParTasks task runs on its own block of rows.
   LL is a list: 'vv' holds the data block, 'comb' the matching ID columns */

start acFunnel(LL);
x=LL$'vv'; 
help=LL$'comb';

qnummese=help[,2:ncol(help)];
y=help[,1];

XT=(SHAPE(DIF(X), NROW(X)) [, 2:NCOL(X)])#2;
idx=loc((x[, 2:NCOL(X)]+xt)=1);
xt[idx]=1;
t2=unique(qnummese);

t22=(repeat(t2, nrow(x))=qnummese)`*(xt = 2);

t22_sum=t22 [+];
ctr=(xt = 2) [+]; 

idx1=loc(t22 &amp;lt;&amp;gt; 0);
s = ndx2sub(dimension(t22), idx1);

res0=j(nrow(s), ncol(x)-1, .);
res0_perc=res0;

do i=1 to ncol(idx1);
idx2=loc( ( (qnummese=t2[s[i,1]])# (xt[, s[i,2]]=2) )=1);
/* PRINT IDX2; */
res0[i,s[i,2]] = t22[idx1[i]];
/* PRINT RES0; */
res0[i,(s[i,2]+1):ncol(xt)]=(x [loc(element(y[,1], y[idx2,1])),(s[i,2]+1):ncol(x)-1] &amp;lt;&amp;gt;0) [+,];
/* PRINT RES0; */
res0_perc[i,(s[i,2]):ncol(xt)]=res0[i,(s[i,2]):ncol(xt)]/res0[i,s[i,2]];
end;

   y =  res0||s;
   return y;
finish;

start acSimParTasks(labl, opt, L);
   cuts = L$'ord';  vals = L$'xx'; idler=L$'idcomb'; varnames=L$'varn';
namer="L1":"L16";

L1= [#'vv'=vals[1:cuts[1]-1,],            #'comb'=idler[1:cuts[1]-1,] ];
L2= [#'vv'=vals[cuts[1]:cuts[2]-1,],      #'comb'=idler[cuts[1]:cuts[2]-1,]];
L3= [#'vv'=vals[cuts[2]:cuts[3]-1,],      #'comb'=idler[cuts[2]:cuts[3]-1,]];
L4= [#'vv'=vals[cuts[3]:cuts[4]-1,],      #'comb'=idler[cuts[3]:cuts[4]-1,]];
L5= [#'vv'=vals[cuts[4]:cuts[5]-1,] ,     #'comb'=idler[cuts[4]:cuts[5]-1,]];
L6= [#'vv'=vals[cuts[5]:cuts[6]-1,] ,     #'comb'=idler[cuts[5]:cuts[6]-1,]];
L7= [#'vv'=vals[cuts[6]:cuts[7]-1,] ,     #'comb'=idler[cuts[6]:cuts[7]-1,]];
L8= [#'vv'=vals[cuts[7]:cuts[8]-1,] ,     #'comb'=idler[cuts[7]:cuts[8]-1,]];
L9= [#'vv'=vals[cuts[8]:cuts[9]-1,] ,     #'comb'=idler[cuts[8]:cuts[9]-1,]] ;
L10=[#'vv'=vals[cuts[9]:cuts[10]-1,] ,    #'comb'=idler[cuts[9]:cuts[10]-1,]];
L11=[#'vv'=vals[cuts[10]:cuts[11]-1,] ,   #'comb'=idler[cuts[10]:cuts[11]-1,]];
L12=[#'vv'=vals[cuts[11]:cuts[12]-1,] ,   #'comb'=idler[cuts[11]:cuts[12]-1,]];
L13=[#'vv'=vals[cuts[12]:cuts[13]-1,] ,   #'comb'=idler[cuts[12]:cuts[13]-1,]];
L14=[#'vv'=vals[cuts[13]:cuts[14]-1,] ,   #'comb'=idler[cuts[13]:cuts[14]-1,]];
L15=[#'vv'=vals[cuts[14]:cuts[15]-1,] ,   #'comb'=idler[cuts[14]:cuts[15]-1,]];
L16=[#'vv'=vals[cuts[15]:cuts[16],] ,     #'comb'=idler[cuts[15]:cuts[16],]];


   Tasks = repeat('acFunnel', 1, 16);  
   Args = [L1, L2, L3, L4, L5, L6, L7, L8, L9, L10, L11, L12, L13, L14, L15, L16];      
   Results = ParTasks(Tasks, Args, opt);
   free M;
   do i = 1 to ListLen(Results);
      M = M // Results$i;
   end;

s=M[, ncol(M)-1:ncol(M)];
MM=M[,1:ncol(M)-2];
qnummese=idler[,2:ncol(idler)];
t2=unique(qnummese);

rown=catx("|", t2 [s[,1]], varnames[s[,2]]);

call MatrixWriteToCAS(MM, '', '_crm_fun', varnames);
/* call MatrixWriteToCAS(res0_perc, '', '_crm_fun_perc', varnames); */
call MatrixWriteToCAS(rown, '', '_crm_fun_id');

/*    varNames = {'mean' 'std' 'min' 'max'}; */
/*    print M[L=labl  F=Best6.]; */
finish;

store module=(acFunnel acSimParTasks);
endsource;
iml / code=DefineMods;
run;

cas mysession sessopts=(caslib="casuser");
/* try your best */
options casdatalimit=all;
proc cas;
loadactionset "iml";
source RandMR;
/* Read the data, cut it into 16 blocks, and process the blocks with ParTasks */
KeepStmt1 = 'KEEP=codidoc codopera  ';
KeepStmt2 = 'KEEP=_numeric_  ';
KeepStmt3 = 'KEEP=QNUMMESE  ';
x = matrixCreateFromCAS('PUBLIC', 'exp2', KeepStmt2 );
y=matrixCreateFromCAS('PUBLIC', 'EXP2', KeepStmt1);
QNUMMESE = matrixCreateFromCAS('PUBLIC', 'EXP2', KeepStmt3);

start TasksPerThread(nT, nW, i);
   n = floor(nT / nW) + (i &amp;lt;= mod(nT, nW));
   return(n);
finish;

nTasks = nrow(y);
nThreads = 16;
i = T(1:nThreads);
n = TasksPerThread(nTasks, nThreads, i);
Thread = char(T(1:nThreads));
/* print n[c={'Num Tasks'} r=Thread L='Tasks per Thread']; */

varnames={&amp;amp;var2s.};

X=X||J(NROW(X),1,0);
XT=(SHAPE(DIF(X), NROW(X)) [, 2:NCOL(X)])#2;
idx=loc((x[, 2:NCOL(X)]+xt)=1);
xt[idx]=1;

y=y[,2]||y[,1];

t2=unique(qnummese);

acum_n=cusum(n);
/* print acum_n; */
call sortndx(idx, y, 1 );
y=y[idx,];
x=x[idx,];
QNUMMESE=QNUMMESE[idx,];

op=y[,1];

acum_ok=acum_n[1:nrow(n)-1];
do i=1 to nrow(n)-1;
flag_ok=(op[acum_n[i]-1]^=op[acum_n[i]]);
do j=1 to 200 until(flag_ok);
flag_ok=(op[acum_n[i]-1+j]^=op[acum_n[i]+j]);

end;
acum_ok[i]=acum_n[i]+j;
end;

help=(acum_ok-1)||(acum_ok) || (acum_ok+1);
help2= rowvec(help)`;
help2=({1}//help2)`;

acum_ok=acum_ok//acum_n[nrow(n)];
op=y[,1]||QNUMMESE;



    load module=(acFunnel acSimParTasks);
       L = [#'xx'=x, #'ord'=acum_ok, #'idcomb'=op, #'varn'=varnames]; 
    run acSimParTasks('ParTasks: Threads First', {2 0}, L );

endsource;
iml / code=RandMR nthreads=16;


run;
quit;
&lt;/CODE&gt;&lt;/PRE&gt;
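&lt;P&gt;The core idea in the code above is to sort by the key and then shift each cut point until the key changes, so that all rows for one key land in the same block. Here is a simplified sketch of that step only. It is not the code above: the data and the names key, nBlocks and lastRow are made up for illustration, and it is written as plain PROC IML so that it can be tried without a CAS session.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc iml;
/* hypothetical sorted keys: 12 rows that belong to 5 customers */
key = {1,1,1,2,2,3,3,3,3,4,5,5};
nBlocks = 3;

/* nominal block sizes, as even as possible */
i = T(1:nBlocks);
n = floor(nrow(key)/nBlocks) + (i &amp;lt;= mod(nrow(key), nBlocks));
lastRow = cusum(n);              /* nominal last row of each block */

/* move each interior cut forward until the next row has a different key */
do b = 1 to nBlocks-1;
   r = lastRow[b];
   done = 0;
   do while (^done);
      if r &amp;gt;= nrow(key) then done = 1;          /* reached the end of the data */
      else if key[r] ^= key[r+1] then done = 1;    /* key changes after row r */
      else r = r + 1;                              /* same key: extend the block */
   end;
   lastRow[b] = r;
end;

print lastRow;                   /* block b ends at row lastRow[b] */
quit;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Block b then covers the rows after the previous cut up to lastRow[b]. If one key has a very long run of rows, two cuts can end up on the same row and leave an empty block, so a real program would need to handle that case.&lt;/P&gt;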
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 03 Nov 2022 15:11:06 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-IML-Software-and-Matrix/How-do-I-distribute-tasks-in-the-IML-action/m-p/842331#M5885</guid>
      <dc:creator>acordes</dc:creator>
      <dc:date>2022-11-03T15:11:06Z</dc:date>
    </item>
    <item>
      <title>Re: How do I distribute tasks in the IML action</title>
      <link>https://communities.sas.com/t5/SAS-IML-Software-and-Matrix/How-do-I-distribute-tasks-in-the-IML-action/m-p/842341#M5886</link>
      <description>&lt;P&gt;&lt;EM&gt;&amp;gt;&amp;nbsp;What had taken a long time before (if it finished execution at all), now takes only 2 minutes&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Congratulations. Glad to hear you were successful!&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 03 Nov 2022 15:28:26 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-IML-Software-and-Matrix/How-do-I-distribute-tasks-in-the-IML-action/m-p/842341#M5886</guid>
      <dc:creator>Rick_SAS</dc:creator>
      <dc:date>2022-11-03T15:28:26Z</dc:date>
    </item>
  </channel>
</rss>

