somebody
Lapis Lazuli | Level 10

I am running PROC MEANS on a large dataset (100 GB) and keep getting insufficient-memory errors. I read this article about in-database processing, which could be my solution, but I do not really know how to implement it. Does anyone know how to deal with this issue?

http://support.sas.com/documentation/cdl/en/proc/61895/HTML/default/viewer.htm#a003331709.htm

8 REPLIES
Astounding
PROC Star

Running out of memory?  It's possible that in-database processing will help.  Assuming that you have in-database processing available, you would simply have to switch from PROC MEANS to PROC HPSUMMARY.  The syntax is pretty much the same.

 

It's also possible that you can control this without running in-database.  Show us the PROC MEANS step that you are trying to run.  Also (and this is unlikely if running in-database is even a possibility), is there a sorted order to your data?

PeterClemmensen
Tourmaline | Level 20

Where and how is this data set located?

 

Try specifying 

 

options sqlgeneration=dbms;

before your PROC MEANS run.
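A rough sketch of how that might fit together, assuming the table lives in a DBMS reachable through a SAS/ACCESS engine. The libref `mydb`, the connection values, the table `big_table`, and the variables `region` and `amount` are all placeholders, and Teradata is only an example engine:

```sas
/* Hypothetical connection: replace the engine and options with your own */
libname mydb teradata server="myserver" user="myuser" password="XXXXXXXX";

/* Let eligible procedures generate SQL and push the work to the DBMS */
options sqlgeneration=dbms;

/* Trace the SQL sent to the DBMS in the log, to check what was pushed down */
options sastrace=',,,d' sastraceloc=saslog nostsuffix;

proc means data=mydb.big_table n mean min max;
   class region;   /* placeholder class variable */
   var amount;     /* placeholder analysis variable */
run;
```

If the log shows no generated SQL, the step most likely ran in SAS rather than in the database.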

PGStats
Opal | Level 21

The documentation does say:

 

"In-database processing can greatly reduce the volume of data transferred to the procedure if there are no class variables (one row is returned) or if the selected class variables have a small number of unique values. However, because PROC MEANS loads the result set into its internal structures, the memory requirements for the SAS process will be equivalent to what would have been required without in-database processing."

 

Switching from CLASS to BY processing would most likely reduce memory requirements, but your data would need to be properly sorted or indexed.
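A minimal sketch of the two ways to prepare the data for BY processing, assuming a hypothetical library `mylib`, table `big_table`, grouping variable `region`, and analysis variable `amount`:

```sas
/* Option 1: sort once (needs disk space for the sorted copy) */
proc sort data=mylib.big_table out=mylib.big_sorted;
   by region;
run;

proc means data=mylib.big_sorted n mean;
   by region;
   var amount;
run;

/* Option 2: create a simple index instead of sorting,     */
/* then run the same BY step against mylib.big_table itself */
proc datasets library=mylib nolist;
   modify big_table;
   index create region;
quit;
```

With BY processing, PROC MEANS only needs to hold one group at a time rather than every CLASS level at once.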

PG
somebody
Lapis Lazuli | Level 10

Thanks, but I read somewhere that using BY is more efficient with large datasets, and so I am confused.

 

PGStats
Opal | Level 21

BY processing is more efficient.

 

What are YOU doing?

PG
SASKiwi
PROC Star

Post your code. Memory requirements vary depending on the number of unique values of your CLASS statement variables. Do you know how many unique values you have?
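One way to check this, sketched with placeholder names (`mylib.big_table` and the variable `region` are assumptions, not from the original post):

```sas
/* Report the number of distinct levels without printing the frequency table */
proc freq data=mylib.big_table nlevels;
   tables region / noprint;
run;

/* Equivalent count with PROC SQL */
proc sql;
   select count(distinct region) as n_levels
   from mylib.big_table;
quit;
```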

PaigeMiller
Diamond | Level 26

Not only should you post your code, but "100 GB" is meaningless in this context. We need to know the number of observations, the number of variables that you are computing means for, and probably the number of BY groups.
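For a SAS data set, the observation and variable counts can be pulled from the dictionary tables. This is a sketch with a hypothetical libref `MYLIB` and member `BIG_TABLE`; for a DBMS table the observation count may be missing there, and a `count(*)` query would be needed instead:

```sas
proc sql;
   /* libname and memname are stored in uppercase in dictionary.tables */
   select nobs, nvar
   from dictionary.tables
   where libname = 'MYLIB' and memname = 'BIG_TABLE';
quit;
```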

--
Paige Miller
ballardw
Super User

@somebody wrote:

Thanks, but I read somewhere that using BY is more efficient with large datasets, and so I am confused.

 


And sometimes directing the output to a data set instead of the output/results window helps if you are generating lots of output in a table.

But since we haven't seen any actual code or log, it is hard to be specific.

 

See this code:

proc sort data=sashelp.class 
   out=work.class;
   by age;
run;

proc means data=work.class;
   by age;
run;

proc means data=work.class;
   class age;
run;

Notice that the displayed tables in the Results window take more "space" with BY processing. The Results window tries to accumulate everything in memory to create the output tables. At a minimum, the repeated header rows for each BY group add to the memory requirement.

 

If you have a largish number of other variables coupled with many requested statistics and many values of the by variables you might be hitting the display memory limit.
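As a sketch of sending the statistics to a data set instead of the Results window, using the same sashelp.class example (the output name `work.class_stats` and the choice of statistics are only illustrative):

```sas
proc sort data=sashelp.class
   out=work.class;
   by age;
run;

/* NOPRINT suppresses the displayed tables; OUTPUT writes the statistics to a data set */
proc means data=work.class noprint;
   by age;
   var height weight;
   output out=work.class_stats mean= std= / autoname;
run;
```

AUTONAME lets SAS build a distinct output variable name for each statistic, so nothing has to be named by hand.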
