<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Maximum number of variables allowed using SAS in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Maximum-number-of-variables-allowed-using-SAS/m-p/701106#M214660</link>
    <description>&lt;P&gt;&lt;SPAN&gt;I just created a data set with one million variables.&lt;/SPAN&gt;&lt;/P&gt;</description>
    <pubDate>Mon, 23 Nov 2020 23:02:20 GMT</pubDate>
    <dc:creator>ChrisNZ</dc:creator>
    <dc:date>2020-11-23T23:02:20Z</dc:date>
    <item>
      <title>Maximum number of variables allowed using SAS</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Maximum-number-of-variables-allowed-using-SAS/m-p/701066#M214629</link>
      <description>&lt;P&gt;Historically, the SAS PDV only permitted about 32k variables. Has this changed in recent years? If so, what is the current maximum number of variables allowed by the PDV?&lt;/P&gt;</description>
      <pubDate>Mon, 23 Nov 2020 21:10:37 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Maximum-number-of-variables-allowed-using-SAS/m-p/701066#M214629</guid>
      <dc:creator>xtc283x</dc:creator>
      <dc:date>2020-11-23T21:10:37Z</dc:date>
    </item>
    <item>
      <title>Re: Maximum number of variables allowed using SAS</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Maximum-number-of-variables-allowed-using-SAS/m-p/701075#M214637</link>
      <description>&lt;P&gt;If you get close to running out of variable allocations, you need to seriously reconsider what you are doing and how you are doing it.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I just created an empty data set with 400,000 numeric variables.&lt;/P&gt;</description>
      <pubDate>Mon, 23 Nov 2020 21:35:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Maximum-number-of-variables-allowed-using-SAS/m-p/701075#M214637</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2020-11-23T21:35:35Z</dc:date>
    </item>
    <item>
      <title>Re: Maximum number of variables allowed using SAS</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Maximum-number-of-variables-allowed-using-SAS/m-p/701084#M214646</link>
      <description>&lt;P&gt;Good to know that the PDV permits more than 32k variables, but that still doesn't pin down the precise upper bound.&lt;/P&gt;&lt;P&gt;With all due respect, things have changed a lot in recent years with regard to any 'norm' about the maximum number of features; for example, algorithms with millions, even billions, of parameters are pretty routine in the ML community.&lt;/P&gt;</description>
      <pubDate>Mon, 23 Nov 2020 21:58:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Maximum-number-of-variables-allowed-using-SAS/m-p/701084#M214646</guid>
      <dc:creator>xtc283x</dc:creator>
      <dc:date>2020-11-23T21:58:46Z</dc:date>
    </item>
    <item>
      <title>Re: Maximum number of variables allowed using SAS</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Maximum-number-of-variables-allowed-using-SAS/m-p/701096#M214653</link>
      <description>&lt;P&gt;SAS Viya also goes beyond some of the constraints of traditional SAS (SAS Release 9). If you want further guidance, tell us more about your use case.&lt;/P&gt;</description>
      <pubDate>Mon, 23 Nov 2020 22:24:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Maximum-number-of-variables-allowed-using-SAS/m-p/701096#M214653</guid>
      <dc:creator>SASKiwi</dc:creator>
      <dc:date>2020-11-23T22:24:50Z</dc:date>
    </item>
    <item>
      <title>Re: Maximum number of variables allowed using SAS</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Maximum-number-of-variables-allowed-using-SAS/m-p/701106#M214660</link>
      <description>&lt;P&gt;&lt;SPAN&gt;I just created a data set with one million variables.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 23 Nov 2020 23:02:20 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Maximum-number-of-variables-allowed-using-SAS/m-p/701106#M214660</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2020-11-23T23:02:20Z</dc:date>
    </item>
    <item>
      <title>Re: Maximum number of variables allowed using SAS</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Maximum-number-of-variables-allowed-using-SAS/m-p/701107#M214661</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/36940"&gt;@xtc283x&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;Good to know that the PDV permits more than 32k variables, but that still doesn't pin down the precise upper bound.&lt;/P&gt;
&lt;P&gt;With all due respect, things have changed a lot in recent years with regard to any 'norm' about the maximum number of features; for example, algorithms with millions, even billions, of parameters are pretty routine in the ML community.&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;With all due respect, "parameters" are not the same as "variables". In Proc IML, a matrix could be considered a single "parameter" for some operations, while a matrix with 1000 rows of 1000 variables (an example, not a stated limit) holds 1,000,000 values that something else could consider parameters. So the use case is somewhat important.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I killed my system when trying to create 4,000,000 variables (just for giggles) because after a few minutes it was still running. 1,000,000 variables took a bit over 4.5 seconds; 2,000,000 took about 1 minute and 50 seconds.&lt;/P&gt;</description>
      <pubDate>Mon, 23 Nov 2020 23:03:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Maximum-number-of-variables-allowed-using-SAS/m-p/701107#M214661</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2020-11-23T23:03:53Z</dc:date>
    </item>
    <item>
      <title>Re: Maximum number of variables allowed using SAS</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Maximum-number-of-variables-allowed-using-SAS/m-p/701108#M214662</link>
      <description>&lt;P&gt;There are also many other ways of storing information that don't require data in that form, so there are other options, such as pairing SAS with Hadoop or other big data technologies. Data storage, analysis/processing, and modeling are not limited by the same things.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/36940"&gt;@xtc283x&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;Good to know that the PDV permits more than 32k variables, but that still doesn't pin down the precise upper bound.&lt;/P&gt;
&lt;P&gt;With all due respect, things have changed a lot in recent years with regard to any 'norm' about the maximum number of features; for example, algorithms with millions, even billions, of parameters are pretty routine in the ML community.&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
</description>
      <pubDate>Mon, 23 Nov 2020 23:08:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Maximum-number-of-variables-allowed-using-SAS/m-p/701108#M214662</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2020-11-23T23:08:22Z</dc:date>
    </item>
    <item>
      <title>Re: Maximum number of variables allowed using SAS</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Maximum-number-of-variables-allowed-using-SAS/m-p/701124#M214672</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13884"&gt;@ballardw&lt;/a&gt;&amp;nbsp;Yes, the cost seems to be CPU-bound and to grow much faster than linearly, which might point to a defect in the underlying logic.&lt;/P&gt;
&lt;P&gt;100k =&amp;gt; 0.5 s&lt;/P&gt;
&lt;P&gt;1m =&amp;gt; 10 s&lt;/P&gt;
&lt;P&gt;1.5m =&amp;gt; 1 min&lt;/P&gt;
&lt;P&gt;2m =&amp;gt; 4 min&lt;/P&gt;
&lt;P&gt;3m =&amp;gt; 18 min&lt;/P&gt;</description>
      <pubDate>Tue, 24 Nov 2020 03:36:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Maximum-number-of-variables-allowed-using-SAS/m-p/701124#M214672</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2020-11-24T03:36:45Z</dc:date>
    </item>
    <item>
      <title>Re: Maximum number of variables allowed using SAS</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Maximum-number-of-variables-allowed-using-SAS/m-p/701144#M214681</link>
      <description>&lt;P&gt;To map variable names to PDV locations, the interpreter needs to build a search tree during data step compilation, and the total cost of adding entries to such a structure grows faster than linearly with the number of names.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;That's why temporary arrays are the fastest constructs in a data step: no individual names, addressing solely through index.&lt;/P&gt;</description>
      <pubDate>Tue, 24 Nov 2020 05:24:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Maximum-number-of-variables-allowed-using-SAS/m-p/701144#M214681</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2020-11-24T05:24:53Z</dc:date>
    </item>
    <item>
      <title>Re: Maximum number of variables allowed using SAS</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Maximum-number-of-variables-allowed-using-SAS/m-p/701151#M214685</link>
      <description>&lt;P&gt;I asked tech support to have a look at this phenomenon. I'll report if anything interesting comes up.&lt;/P&gt;</description>
      <pubDate>Tue, 24 Nov 2020 06:03:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Maximum-number-of-variables-allowed-using-SAS/m-p/701151#M214685</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2020-11-24T06:03:22Z</dc:date>
    </item>
    <item>
      <title>Re: Maximum number of variables allowed using SAS</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Maximum-number-of-variables-allowed-using-SAS/m-p/701407#M214787</link>
      <description>&lt;P&gt;Something interesting came out of the conversation with tech support.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The SAS interpreter guesses how many variables the output table will need when it creates the output buffer (not the PDV; procedures use this same data set output buffer creation logic as well). This estimate is necessary because some procedures request far too many variables (as in: millions!).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;When this estimated size is exceeded, the estimation process reruns &lt;STRONG&gt;for every variable added&lt;/STRONG&gt;. This explains the heavy CPU load. There is no plan to fix this at the moment since this has never caused real-world issues.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Notes:&lt;/P&gt;
&lt;P&gt;- In a &lt;FONT face="courier new,courier"&gt;data _null_&lt;/FONT&gt; step, this issue does not appear as no output buffer is created (but a PDV is).&lt;/P&gt;
&lt;P&gt;- I created a table with 8 million variables; this took about 8 hours and used 4 GB of RAM, which is the maximum I have access to.&lt;/P&gt;
&lt;P&gt;- &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/11650"&gt;@SimonDawson&lt;/a&gt;&amp;nbsp;can tell you more if you are interested.&lt;/P&gt;
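&lt;P&gt;A minimal sketch of the comparison described in the notes above, assuming the variables are declared through an ARRAY statement (the thread does not show the code actually used; the array name v, the data set name wide, and the count of 1,000,000 are illustrative). The first step creates an output data set, so the output buffer logic applies; the second is a data _null_ step, so only a PDV is built:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Assumption: illustrative names and size, not the code used in this thread */
data wide;               /* output data set: an output buffer is created */
  array v {1000000};     /* declares numeric variables v1-v1000000 */
  stop;                  /* write no observations: empty data set */
run;

data _null_;             /* no output data set: only the PDV is built */
  array v {1000000};
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Comparing the real and CPU times that the log reports for the two steps should show how much of the cost comes from the output side.&lt;/P&gt;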
</description>
      <pubDate>Wed, 25 Nov 2020 00:53:17 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Maximum-number-of-variables-allowed-using-SAS/m-p/701407#M214787</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2020-11-25T00:53:17Z</dc:date>
    </item>
    <item>
      <title>Re: Maximum number of variables allowed using SAS</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Maximum-number-of-variables-allowed-using-SAS/m-p/701414#M214794</link>
      <description>&lt;P&gt;The key sentence here:&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;&lt;SPAN&gt;since this has never caused real-world issues&lt;/SPAN&gt;&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;</description>
      <pubDate>Wed, 25 Nov 2020 01:39:10 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Maximum-number-of-variables-allowed-using-SAS/m-p/701414#M214794</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2020-11-25T01:39:10Z</dc:date>
    </item>
    <item>
      <title>Re: Maximum number of variables allowed using SAS</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Maximum-number-of-variables-allowed-using-SAS/m-p/701421#M214798</link>
      <description>The absence of 'real-world issues' in the SAS world may be due less to a lack of demand for millions or billions of features, parameters, or variables (Google engineers run algorithms of that magnitude on a routine basis) and more to SAS's inability to deliver and perform at that level.&lt;BR /&gt;Just saying...&lt;BR /&gt;</description>
      <pubDate>Wed, 25 Nov 2020 02:21:57 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Maximum-number-of-variables-allowed-using-SAS/m-p/701421#M214798</guid>
      <dc:creator>xtc283x</dc:creator>
      <dc:date>2020-11-25T02:21:57Z</dc:date>
    </item>
    <item>
      <title>Re: Maximum number of variables allowed using SAS</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Maximum-number-of-variables-allowed-using-SAS/m-p/701424#M214800</link>
      <description>&lt;P&gt;&lt;EM&gt;&amp;gt; Just saying...&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;Partly true; it's a chicken-and-egg situation.&lt;/P&gt;
&lt;P&gt;Having said that, such large models are typically run in memory, so a fairer comparison would be to look at CAS data. I'm not too sure what's happening there.&lt;/P&gt;
</description>
      <pubDate>Wed, 25 Nov 2020 03:08:31 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Maximum-number-of-variables-allowed-using-SAS/m-p/701424#M214800</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2020-11-25T03:08:31Z</dc:date>
    </item>
    <item>
      <title>Re: Maximum number of variables allowed using SAS</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Maximum-number-of-variables-allowed-using-SAS/m-p/701793#M214925</link>
      <description>&lt;P&gt;I think you are confusing the &lt;EM&gt;number&lt;/EM&gt; of data items that a data or proc step can handle simultaneously with the need to define a &lt;EM&gt;name&lt;/EM&gt; for each individual item.&lt;/P&gt;
&lt;P&gt;Like other languages, SAS provides constructs for this. The fastest is a temporary array; hash objects are similarly quick and are limited only by the memory available to the SAS session:&lt;/P&gt;
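&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
  length n value 8;
  declare hash h ();             /* in-memory hash object */
  h.definekey("n");              /* key: the loop counter */
  h.definedata("value");         /* data: one numeric value per key */
  h.definedone();
  do n = 1 to 10000000;          /* 10 million key/value pairs */
    value = rand("uniform") * 1000;
    h.add();
  end;
run;&lt;/CODE&gt;&lt;/PRE&gt;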
&lt;P&gt;Log:&lt;/P&gt;
&lt;PRE&gt;27         data _null_;
28         length n value 8;
29         declare hash h ();
30         h.definekey("n");
31         h.definedata("value");
32         h.definedone();
33         do n = 1 to 10000000;
34           value = rand("uniform") * 1000;
35           h.add();
36         end;
37         run;

NOTE: DATA statement used (Total process time):
      real time           9.70 seconds
      cpu time            2.56 seconds
      

&lt;/PRE&gt;
&lt;P&gt;In just under 10 seconds, the step created 20 million numeric values (10 million keys and 10 million data values) and built the search tree for the keys.&lt;/P&gt;
&lt;P&gt;This ran on a 2-core pSeries server with a MEMSIZE of 512M.&lt;/P&gt;
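&lt;P&gt;For comparison, a minimal sketch of the temporary-array variant mentioned above, assuming the same 10 million elements (the array name v is illustrative, and this step was not part of the timed run shown in the log):&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
  /* Assumption: 10 million elements, mirroring the loop in the hash example */
  array v {10000000} _temporary_;   /* elements have no names, only an index */
  do n = 1 to dim(v);
    v{n} = rand("uniform") * 1000;  /* addressed solely through the index */
  end;
run;&lt;/CODE&gt;&lt;/PRE&gt;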
</description>
      <pubDate>Thu, 26 Nov 2020 10:52:33 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Maximum-number-of-variables-allowed-using-SAS/m-p/701793#M214925</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2020-11-26T10:52:33Z</dc:date>
    </item>
  </channel>
</rss>

