<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: CAS Data Step Hash Table: How many copies are there? in SAS Viya</title>
    <link>https://communities.sas.com/t5/SAS-Viya/CAS-Data-Step-Hash-Table-How-many-copies-are-there/m-p/905985#M2128</link>
    <description>&lt;P&gt;I've received an answer via another channel, so I'm adding the information here as well:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;If the hash is "read only", a single instance of the hash table is created for the whole process (even with multiple worker nodes, it is still a single instance).&lt;/LI&gt;
&lt;LI&gt;If the hash is "read/write", there is one copy per thread, which can be a lot. In the environment I'm currently "playing" in, processes run with up to 192 threads.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Whether a hash is read-only or read/write is decided at compile time. Any hash method in the code that can modify the data, such as add(), leads to one instance per thread.&lt;/P&gt;
</description>
    <pubDate>Mon, 04 Dec 2023 12:53:53 GMT</pubDate>
    <dc:creator>Patrick</dc:creator>
    <dc:date>2023-12-04T12:53:53Z</dc:date>
    <item>
      <title>CAS Data Step Hash Table: How many copies are there?</title>
      <link>https://communities.sas.com/t5/SAS-Viya/CAS-Data-Step-Hash-Table-How-many-copies-are-there/m-p/904610#M2114</link>
      <description>&lt;P&gt;Hi all,&lt;/P&gt;
&lt;P&gt;When a data step that uses a hash table runs multithreaded in CAS, how many times does the hash table get loaded into memory?&lt;/P&gt;
&lt;P&gt;Once per thread? Once per worker? Or even only once overall, with the controller doing something smart?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;My current understanding is that it would need to be once per thread. If that's not the case, I'd be happy to be wrong and eager to learn how things work.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Here is some self-contained working code I used for my testing. It shows that the data step with the hash lookup runs multithreaded on multiple workers (in my environment) and returns the expected result.&lt;/P&gt;
&lt;LI-SPOILER&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;options msglevel=i;
cas mysess cassessopts=(caslib="casuser");

libname casuser cas;

data casuser.class(copies=0);
  set sashelp.class end=last;
  do i=1 to 300;
    output;
  end;
  if last then output;
  drop i;
run;

data casuser.base_table(copies=0 replace=yes);
  length row_id_1 $20;
  row_id_1=catx('_',_threadid_,_n_);
  threadid_1=_threadid_;
  hostname_1=_hostname_;
  set casuser.class;
run;

data casuser.lookup_table(duplicate=yes replace=yes);
  set sashelp.class;
  if name in ('Alfred','Judy','William');
run;

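/* Hash lookup step: reads from and writes to CAS tables */
data casuser.result(replace=yes);
  /* declare the hash on the first iteration (_n_=1 is true once in each thread) */
  if _n_=1 then
    do;
      dcl hash h1(dataset:'casuser.lookup_table');
      h1.defineKey('name');
      h1.defineDone();
    end;
  length row_id_2 $20 threadid_2 8 hostname_2 $20;
  set casuser.base_table;
  /* keep only rows whose name exists in the lookup table */
  if h1.check()=0;
  /* record which thread and host processed the row */
  row_id_2=catx('_',_threadid_,_n_);
  threadid_2=_threadid_;
  hostname_2=_hostname_;
run;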

proc freq data=casuser.result;
  table threadid_2*hostname_2 /nocol norow nocum nopercent;
  table threadid_2*name /nocol norow nocum nopercent;
  table hostname_2*name/nocol norow nocum nopercent;
  table name/nocol norow nocum nopercent;
run;

/* proc print data=casuser.result; */
/* run; */

cas mysess terminate;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;/LI-SPOILER&gt;
&lt;P&gt;And here is the first PROC FREQ table from the code above. It shows that, in my environment, the data step with the hash lookup runs on 4 workers with 3 threads per worker.&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Patrick_0-1701082837201.png" style="width: 556px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/90208i60ECA3EF7CD7B3E5/image-dimensions/556x456?v=v2" width="556" height="456" role="button" title="Patrick_0-1701082837201.png" alt="Patrick_0-1701082837201.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/3174"&gt;@DerylHollick&lt;/a&gt;&amp;nbsp;,&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/21262"&gt;@hashman&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
</description>
      <pubDate>Mon, 27 Nov 2023 11:24:30 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Viya/CAS-Data-Step-Hash-Table-How-many-copies-are-there/m-p/904610#M2114</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2023-11-27T11:24:30Z</dc:date>
    </item>
    <item>
      <title>Re: CAS Data Step Hash Table: How many copies are there?</title>
      <link>https://communities.sas.com/t5/SAS-Viya/CAS-Data-Step-Hash-Table-How-many-copies-are-there/m-p/905985#M2128</link>
      <description>&lt;P&gt;I've received an answer via another channel, so I'm adding the information here as well:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;If the hash is "read only", a single instance of the hash table is created for the whole process (even with multiple worker nodes, it is still a single instance).&lt;/LI&gt;
&lt;LI&gt;If the hash is "read/write", there is one copy per thread, which can be a lot. In the environment I'm currently "playing" in, processes run with up to 192 threads.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Whether a hash is read-only or read/write is decided at compile time. Any hash method in the code that can modify the data, such as add(), leads to one instance per thread.&lt;/P&gt;
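&lt;P&gt;To make the distinction concrete, here is a minimal sketch of both cases. It reuses the table names from my original post; the output tables result_ro and result_rw and the hash names h_ro and h_rw are made up for illustration. That the first step shares one hash instance while the second gets one per thread is an assumption based only on the answer relayed above. The code itself does not prove it.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;cas mysess cassessopts=(caslib="casuser");
libname casuser cas;

/* small base and lookup tables, as in the original post */
data casuser.base_table(replace=yes);
  set sashelp.class;
run;

data casuser.lookup_table(duplicate=yes replace=yes);
  set sashelp.class;
  if name in ('Alfred','Judy','William');
run;

/* Case 1: only check() is called, so nothing in the step can modify the hash. */
/* Per the answer above, this should be treated as "read only".                */
data casuser.result_ro(replace=yes);
  if _n_=1 then
    do;
      dcl hash h_ro(dataset:'casuser.lookup_table');
      h_ro.defineKey('name');
      h_ro.defineDone();
    end;
  set casuser.base_table;
  if h_ro.check()=0;
run;

/* Case 2: the step contains add(), which can modify the hash.                 */
/* Per the answer above, this should be treated as "read/write",               */
/* even if add() ends up being called for only a few rows.                     */
data casuser.result_rw(replace=yes);
  if _n_=1 then
    do;
      dcl hash h_rw(dataset:'casuser.lookup_table');
      h_rw.defineKey('name');
      h_rw.defineDone();
    end;
  set casuser.base_table;
  /* add names that are not in the hash yet */
  if h_rw.check() ne 0 then rc=h_rw.add();
  drop rc;
run;

cas mysess terminate;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;With a lookup table this small the difference won't be noticeable. Comparing the memory use of the two steps with a large lookup table would be one way to check the behaviour in your own environment.&lt;/P&gt;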
</description>
      <pubDate>Mon, 04 Dec 2023 12:53:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Viya/CAS-Data-Step-Hash-Table-How-many-copies-are-there/m-p/905985#M2128</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2023-12-04T12:53:53Z</dc:date>
    </item>
  </channel>
</rss>

