<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: how many datasets we can use in set and merge statements in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/how-many-datasets-we-can-use-in-set-and-merge-statements/m-p/731145#M227754</link>
    <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/12447"&gt;@Patrick&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;....&lt;/P&gt;
&lt;P&gt;I'd advise re-visiting any design where you have to deal with a double-digit number of tables at once.&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;I agree this is a preferable state of affairs.&amp;nbsp; However, we have a very large collection of data for which we have good justification to deal with 20 to 30 tables at once - sometimes 200.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;In particular, we have two SAS datasets for each day of trading on the major American stock exchanges.&amp;nbsp; One dataset (the actual trades) will have hundreds of millions of records (with timestamps to the nanosecond) per day - every day has exactly the same 15 or so variables.&amp;nbsp; The other ("quotes" - i.e. offers to buy or sell) will have an order of magnitude more records, but is almost as skinny - about 20 variables.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The trade dataset is named CTyyyymmdd and the quotes dataset is named CQyyyymmdd.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Frequently a user needs to process a month of data - e.g. they will use &lt;EM&gt;&lt;STRONG&gt;SET CT201206:&lt;/STRONG&gt;&lt;/EM&gt; to read in trades for June 2012.&amp;nbsp; This becomes problematic when the user gets greedy and tries to get data for a whole year - about 200 datasets (200 trading days).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;But the problem is usually related to memory, which is often solved by using the "open=defer" option of SET - which tells SAS to re-use the same input buffer for each incoming dataset, rather than the default of one buffer per dataset.&amp;nbsp; Of course this relies on knowing that each dataset has the same collection of variables: "open=defer" will not honor new variables introduced after the first dataset in the list.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Sat, 03 Apr 2021 15:14:20 GMT</pubDate>
    <dc:creator>mkeintz</dc:creator>
    <dc:date>2021-04-03T15:14:20Z</dc:date>
    <item>
      <title>how many datasets we can use in set and merge statements</title>
      <link>https://communities.sas.com/t5/SAS-Programming/how-many-datasets-we-can-use-in-set-and-merge-statements/m-p/731128#M227746</link>
      <description>&lt;P&gt;data ex1;&lt;/P&gt;
&lt;P&gt;set table1 table2 table3 table4 ..........tablen;&lt;/P&gt;
&lt;P&gt;run;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;data ex2;&lt;/P&gt;
&lt;P&gt;merge table1 table2 table3 table4 ..........tablen;&lt;/P&gt;
&lt;P&gt;run;&lt;/P&gt;
&lt;P&gt;In the two steps above, how many datasets can we list (table1, table2, ... up to what number)?&lt;/P&gt;
&lt;P&gt;I think we can use 256 datasets. Kindly let me know.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 03 Apr 2021 12:17:56 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/how-many-datasets-we-can-use-in-set-and-merge-statements/m-p/731128#M227746</guid>
      <dc:creator>thanikondharish</dc:creator>
      <dc:date>2021-04-03T12:17:56Z</dc:date>
    </item>
    <item>
      <title>Re: how many datasets we can use in set and merge statements</title>
      <link>https://communities.sas.com/t5/SAS-Programming/how-many-datasets-we-can-use-in-set-and-merge-statements/m-p/731129#M227747</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;As far as I recall, the 256-table limit is for PROC SQL.&lt;/P&gt;
&lt;P&gt;I do not think the SET and MERGE statements in the data step have an explicit limit on the number of tables, and if there is a limit it is surely bigger than 256.&lt;/P&gt;
&lt;P&gt;But this is very easy to test, of course!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Cheers,&lt;/P&gt;
&lt;P&gt;Koen&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 03 Apr 2021 12:48:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/how-many-datasets-we-can-use-in-set-and-merge-statements/m-p/731129#M227747</guid>
      <dc:creator>sbxkoenk</dc:creator>
      <dc:date>2021-04-03T12:48:05Z</dc:date>
    </item>
    <item>
      <title>Re: how many datasets we can use in set and merge statements</title>
      <link>https://communities.sas.com/t5/SAS-Programming/how-many-datasets-we-can-use-in-set-and-merge-statements/m-p/731137#M227748</link>
      <description>&lt;P&gt;Using below sample code for testing I've encountered an error starting with ds #7513 (not caused by running out of disk space).&lt;/P&gt;
&lt;P&gt;I consider your question to be rather academic though and I'd advice to re-visit any design where you have to deal with a double digit of tables at once.&lt;/P&gt;
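&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Create many one-row datasets (tbl_000001, tbl_000002, ...) via a hash object */
data _null_;
  if 0 then set sashelp.class(keep=name);  /* define NAME without reading any rows */
  dcl hash h1(dataset:'sashelp.class(where=(name="Alfred"))');
  h1.defineKey('name');
  h1.defineData('name');
  h1.defineDone();
  do i=1 to 2000;  /* raise the upper bound to probe for a limit */
    h1.output(dataset:cats('tbl_',put(i,z6.)));
  end;
  stop;
run;

/* Read all of them back with a single SET statement */
data test;
  set tbl_:;
run;

/* Clean up the generated datasets */
proc datasets lib=work nolist nodetails;
  delete tbl_:;
  run;
quit;

proc contents data=test;
run;&lt;/CODE&gt;&lt;/PRE&gt;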
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Patrick_0-1617458368042.png" style="width: 400px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/56875iACC2D95C9401983E/image-size/medium?v=v2&amp;amp;px=400" role="button" title="Patrick_0-1617458368042.png" alt="Patrick_0-1617458368042.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 03 Apr 2021 14:03:00 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/how-many-datasets-we-can-use-in-set-and-merge-statements/m-p/731137#M227748</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2021-04-03T14:03:00Z</dc:date>
    </item>
    <item>
      <title>Re: how many datasets we can use in set and merge statements</title>
      <link>https://communities.sas.com/t5/SAS-Programming/how-many-datasets-we-can-use-in-set-and-merge-statements/m-p/731145#M227754</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/12447"&gt;@Patrick&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;....&lt;/P&gt;
&lt;P&gt;I'd advise re-visiting any design where you have to deal with a double-digit number of tables at once.&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;I agree this is a preferable state of affairs.&amp;nbsp; However, we have a very large collection of data for which we have good justification to deal with 20 to 30 tables at once - sometimes 200.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;In particular, we have two SAS datasets for each day of trading on the major American stock exchanges.&amp;nbsp; One dataset (the actual trades) will have hundreds of millions of records (with timestamps to the nanosecond) per day - every day has exactly the same 15 or so variables.&amp;nbsp; The other ("quotes" - i.e. offers to buy or sell) will have an order of magnitude more records, but is almost as skinny - about 20 variables.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The trade dataset is named CTyyyymmdd and the quotes dataset is named CQyyyymmdd.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Frequently a user needs to process a month of data - e.g. they will use &lt;EM&gt;&lt;STRONG&gt;SET CT201206:&lt;/STRONG&gt;&lt;/EM&gt; to read in trades for June 2012.&amp;nbsp; This becomes problematic when the user gets greedy and tries to get data for a whole year - about 200 datasets (200 trading days).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;But the problem is usually related to memory, which is often solved by using the "open=defer" option of SET - which tells SAS to re-use the same input buffer for each incoming dataset, rather than the default of one buffer per dataset.&amp;nbsp; Of course this relies on knowing that each dataset has the same collection of variables: "open=defer" will not honor new variables introduced after the first dataset in the list.&lt;/P&gt;
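&lt;P&gt;A minimal sketch of the whole-year case, assuming a hypothetical libref TAQ that holds the daily trade datasets and a hypothetical output name TRADES_2012 (neither name comes from the actual setup), and assuming every daily dataset has identical variables:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* TAQ and TRADES_2012 are illustrative names only */
data trades_2012;
  /* CT2012: matches every dataset whose name starts with CT2012.   */
  /* OPEN=DEFER opens the datasets one at a time and reuses a       */
  /* single input buffer instead of allocating one per dataset.     */
  set taq.ct2012: open=defer;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Because the variable list is fixed by the first dataset opened, this only suits cases where no later dataset introduces new variables.&lt;/P&gt;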
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 03 Apr 2021 15:14:20 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/how-many-datasets-we-can-use-in-set-and-merge-statements/m-p/731145#M227754</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2021-04-03T15:14:20Z</dc:date>
    </item>
    <item>
      <title>Re: how many datasets we can use in set and merge statements</title>
      <link>https://communities.sas.com/t5/SAS-Programming/how-many-datasets-we-can-use-in-set-and-merge-statements/m-p/731146#M227755</link>
      <description>&lt;P&gt;You must have a small machine.&amp;nbsp; I was able to use your program to build/read 20,000 tiny datasets.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Note you can reduce the amount of memory used by adding the OPEN=DEFER option to the SET statement.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;options fullstimer ;
data test;
  set tbl_000001-tbl_001000;
run;

data test;
  length name $8;
  set tbl_000001-tbl_001000 open=defer;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Here are the two FULLSTIMER outputs:&lt;/P&gt;
&lt;PRE&gt;NOTE: DATA statement used (Total process time):
      real time           0.25 seconds
      user cpu time       0.26 seconds
      system cpu time     0.00 seconds
      memory              192604.68k
      OS Memory           446336.00k
      Timestamp           04/03/2021 11:12:47 AM
      Step Count                        18  Switch Count  0
      Page Faults                       1
      Page Reclaims                     34273
      Page Swaps                        0
      Voluntary Context Switches        5
      Involuntary Context Switches      4
      Block Input Operations            352
      Block Output Operations           392


NOTE: DATA statement used (Total process time):
      real time           0.13 seconds
      user cpu time       0.13 seconds
      system cpu time     0.00 seconds
      memory              2285.09k
      OS Memory           145772.00k
      Timestamp           04/03/2021 11:12:48 AM
      Step Count                        19  Switch Count  0
      Page Faults                       0
      Page Reclaims                     45
      Page Swaps                        0
      Voluntary Context Switches        1
      Involuntary Context Switches      0
      Block Input Operations            0
      Block Output Operations           264&lt;/PRE&gt;</description>
      <pubDate>Sat, 03 Apr 2021 15:21:48 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/how-many-datasets-we-can-use-in-set-and-merge-statements/m-p/731146#M227755</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2021-04-03T15:21:48Z</dc:date>
    </item>
    <item>
      <title>Re: how many datasets we can use in set and merge statements</title>
      <link>https://communities.sas.com/t5/SAS-Programming/how-many-datasets-we-can-use-in-set-and-merge-statements/m-p/731182#M227767</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/31461"&gt;@mkeintz&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I was thinking more about merge/join when I made my statement. You're of course right that stacking transactional/daily data is not that uncommon and can easily involve &amp;gt;10 tables.&lt;/P&gt;
&lt;P&gt;I had never fully appreciated the impact the open=defer option has on memory. Very valuable to keep in mind. Thanks for that!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/159"&gt;@Tom&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The limiting factor for me was the default setting of MEMMAXSZ (2GB). Once I increased this value I could process more tables.&lt;/P&gt;</description>
      <pubDate>Sun, 04 Apr 2021 00:53:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/how-many-datasets-we-can-use-in-set-and-merge-statements/m-p/731182#M227767</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2021-04-04T00:53:13Z</dc:date>
    </item>
  </channel>
</rss>

