<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: PROC MEANS with large dataset (100gb) in SAS Procedures</title>
    <link>https://communities.sas.com/t5/SAS-Procedures/PROC-MEANS-with-large-dataset-100gb/m-p/470753#M70911</link>
    <description>&lt;P&gt;Not only should you post your code, but 100gb is meaningless in this context. We need to know the number of observations, and the number of variables that you are computing means for, and probably the number of BY groups.&lt;/P&gt;</description>
    <pubDate>Sat, 16 Jun 2018 01:18:16 GMT</pubDate>
    <dc:creator>PaigeMiller</dc:creator>
    <dc:date>2018-06-16T01:18:16Z</dc:date>
    <item>
      <title>PROC MEANS with large dataset (100gb)</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/PROC-MEANS-with-large-dataset-100gb/m-p/470709#M70903</link>
      <description>&lt;P&gt;I am running PROC MEANS on a large dataset (100gb) and keep getting insufficient-memory errors. I read this article about in-database processing, which could be my solution, but I do not really know how to implement it. Does anyone know how to deal with this issue?&lt;/P&gt;&lt;P&gt;&lt;A href="http://support.sas.com/documentation/cdl/en/proc/61895/HTML/default/viewer.htm#a003331709.htm" target="_blank"&gt;http://support.sas.com/documentation/cdl/en/proc/61895/HTML/default/viewer.htm#a003331709.htm&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 15 Jun 2018 21:21:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/PROC-MEANS-with-large-dataset-100gb/m-p/470709#M70903</guid>
      <dc:creator>somebody</dc:creator>
      <dc:date>2018-06-15T21:21:46Z</dc:date>
    </item>
    <item>
      <title>Re: PROC MEANS with large dataset (100gb)</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/PROC-MEANS-with-large-dataset-100gb/m-p/470714#M70904</link>
      <description>&lt;P&gt;Running out of memory?&amp;nbsp; It's possible that in-database processing will help.&amp;nbsp; Assuming that you have in-database processing available, you would simply have to switch from PROC MEANS to PROC HPSUMMARY.&amp;nbsp; The syntax is pretty much the same.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;It's also possible that you can control this without running in-database.&amp;nbsp; Show us the PROC MEANS step that you are trying to run.&amp;nbsp; Also (and this is unlikely if running in-database is even a possibility), is there a sorted order to your data?&lt;/P&gt;</description>
      <pubDate>Fri, 15 Jun 2018 21:39:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/PROC-MEANS-with-large-dataset-100gb/m-p/470714#M70904</guid>
      <dc:creator>Astounding</dc:creator>
      <dc:date>2018-06-15T21:39:13Z</dc:date>
    </item>
    <item>
      <title>Re: PROC MEANS with large dataset (100gb)</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/PROC-MEANS-with-large-dataset-100gb/m-p/470715#M70905</link>
      <description>&lt;P&gt;Where and how is this data set located?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Try specifying&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;options sqlgeneration="dbms";&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;before your PROC MEANS run.&lt;/P&gt;</description>
      <pubDate>Fri, 15 Jun 2018 21:39:34 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/PROC-MEANS-with-large-dataset-100gb/m-p/470715#M70905</guid>
      <dc:creator>PeterClemmensen</dc:creator>
      <dc:date>2018-06-15T21:39:34Z</dc:date>
    </item>
    <item>
      <title>Re: PROC MEANS with large dataset (100gb)</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/PROC-MEANS-with-large-dataset-100gb/m-p/470719#M70906</link>
      <description>&lt;P&gt;The documentation does say:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;"In-database processing can greatly reduce the volume of data transferred to the procedure if there are no class variables (one row is returned) or if the selected class variables have a small number of unique values. However, because PROC MEANS loads the result set into its internal structures, the memory requirements for the SAS process will be equivalent to what would have been required without in-database processing&lt;/EM&gt;."&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Switching from CLASS to BY processing would most likely reduce memory requirements, but your data would need to be properly sorted or indexed.&lt;/P&gt;</description>
      <pubDate>Fri, 15 Jun 2018 22:04:18 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/PROC-MEANS-with-large-dataset-100gb/m-p/470719#M70906</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2018-06-15T22:04:18Z</dc:date>
    </item>
    <item>
      <title>Re: PROC MEANS with large dataset (100gb)</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/PROC-MEANS-with-large-dataset-100gb/m-p/470720#M70907</link>
      <description>&lt;P&gt;Thanks, but I read somewhere that using BY is more efficient with large datasets, and so I am confused.&lt;/P&gt;</description>
      <pubDate>Fri, 15 Jun 2018 22:05:37 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/PROC-MEANS-with-large-dataset-100gb/m-p/470720#M70907</guid>
      <dc:creator>somebody</dc:creator>
      <dc:date>2018-06-15T22:05:37Z</dc:date>
    </item>
    <item>
      <title>Re: PROC MEANS with large dataset (100gb)</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/PROC-MEANS-with-large-dataset-100gb/m-p/470722#M70908</link>
      <description>&lt;P&gt;BY processing is more efficient.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;What are YOU doing?&lt;/P&gt;</description>
      <pubDate>Fri, 15 Jun 2018 22:08:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/PROC-MEANS-with-large-dataset-100gb/m-p/470722#M70908</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2018-06-15T22:08:22Z</dc:date>
    </item>
    <item>
      <title>Re: PROC MEANS with large dataset (100gb)</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/PROC-MEANS-with-large-dataset-100gb/m-p/470740#M70910</link>
      <description>&lt;P&gt;Post your code. Memory requirements vary depending on the number of unique values of your CLASS statement variables. Do you know how many unique values you have?&lt;/P&gt;</description>
      <pubDate>Fri, 15 Jun 2018 23:09:10 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/PROC-MEANS-with-large-dataset-100gb/m-p/470740#M70910</guid>
      <dc:creator>SASKiwi</dc:creator>
      <dc:date>2018-06-15T23:09:10Z</dc:date>
    </item>
    <item>
      <title>Re: PROC MEANS with large dataset (100gb)</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/PROC-MEANS-with-large-dataset-100gb/m-p/470753#M70911</link>
      <description>&lt;P&gt;Not only should you post your code, but 100gb is meaningless in this context. We need to know the number of observations, and the number of variables that you are computing means for, and probably the number of BY groups.&lt;/P&gt;</description>
      <pubDate>Sat, 16 Jun 2018 01:18:16 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/PROC-MEANS-with-large-dataset-100gb/m-p/470753#M70911</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2018-06-16T01:18:16Z</dc:date>
    </item>
    <item>
      <title>Re: PROC MEANS with large dataset (100gb)</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/PROC-MEANS-with-large-dataset-100gb/m-p/471218#M70919</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/98381"&gt;@somebody&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;Thanks, but I read somewhere that using BY is more efficient with large datasets, and so I am confused.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;And sometimes directing the output to a data set instead of the output/results window helps if you are generating lots of output in a table.&lt;/P&gt;
&lt;P&gt;But we haven't seen any actual code or log, so it is hard to be specific.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;See this code:&lt;/P&gt;
&lt;PRE&gt;proc sort data=sashelp.class 
   out=work.class;
   by age;
run;

proc means data=work.class;
   by age;
run;

proc means data=work.class;
   class age;
run;&lt;/PRE&gt;
&lt;P&gt;Notice that the tables displayed in the Results window for the BY version take more "space" than the single table from the CLASS version. The Results window tries to accumulate everything in memory to create the output tables. At a minimum, the repeated header rows for each BY group add to the memory requirement.&lt;/P&gt;
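&lt;P&gt;A minimal sketch of sending the statistics to a data set instead of the Results window. NOPRINT suppresses the displayed tables and the OUTPUT statement writes the results to a data set. The VAR list and the output data set name work.class_stats are only illustrative choices, and whether this removes the memory error in your case is an assumption you would need to check against your own log:&lt;/P&gt;
&lt;PRE&gt;/* Sort first so that BY processing is valid */
proc sort data=sashelp.class 
   out=work.class;
   by age;
run;

/* NOPRINT: no displayed tables; results go to a data set only */
proc means data=work.class noprint;
   by age;
   var height weight;
   /* AUTONAME generates output names such as Height_Mean */
   output out=work.class_stats mean= std= / autoname;
run;&lt;/PRE&gt;
&lt;P&gt;With this arrangement nothing is rendered for display, so the size of the Results window output no longer matters. The output data set has one row per BY group.&lt;/P&gt;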
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you have a largish number of other variables, coupled with many requested statistics and many values of the BY variables, you might be hitting the display memory limit.&lt;/P&gt;</description>
      <pubDate>Mon, 18 Jun 2018 19:55:15 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/PROC-MEANS-with-large-dataset-100gb/m-p/471218#M70919</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2018-06-18T19:55:15Z</dc:date>
    </item>
  </channel>
</rss>

