kbaughma
Calcite | Level 5

I have run into memory issues using Proc Mix. I cannot include as many random effects as I would like without getting a message about insufficient memory. I am using SAS 9.4 and have been using the memory=MAX command, which has given me some more memory but not enough. I would like to buy a more powerful computer but am unsure what to buy. My current computer is a Dell running 64-bit Windows 7 Professional with an Intel(R) Core(TM) i7 CPU 960 @ 3.20GHz and 12.0 GB of RAM. Will it help to buy a more powerful computer? I would like to know what others are using and whether they run into similar memory issues.

thanks!

14 REPLIES
lvm
Rhodochrosite | Level 12

I presume you mean PROC MIXED. You should use HPMIXED. This should handle your memory problems. Most of the syntax is the same, although there are far fewer options with HPMIXED.

http://support.sas.com/resources/papers/proceedings09/256-2009.pdf

The mixed model equations can consist of some very large matrices; inverting them takes a great deal of time and memory when there are many random effects. I highly recommend that you figure out how to use HPMIXED.
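
For anyone new to the procedure, here is a minimal sketch of what the switch might look like. The data set and variable names (mydata, y, trt, block, plot) are placeholders I made up for illustration, not anything from the original question, so adapt the CLASS, MODEL, and RANDOM statements to your own model:

```sas
/* Hypothetical names: mydata, y, trt, block, plot. */
proc hpmixed data=mydata;
   class trt block plot;
   model y = trt / solution;               /* fixed effects, written as in PROC MIXED */
   random intercept plot / subject=block;  /* random effects, processed by subject */
run;
```

If a model already runs in PROC MIXED on a smaller data set, fitting the same model in both procedures first is a reasonable way to confirm that the estimates agree before scaling up.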

kbaughma
Calcite | Level 5

Thanks, I will look into that procedure.

FreelanceReinh
Jade | Level 19

Yes, I also ran into an unexpected memory issue recently, with this innocuous PROC MEANS step:

proc means data=tmt mean min median;
var dttm;
run;

Log message: "A shortage of memory has caused the quantile computations to terminate prematurely for QMETHOD=OS. ..."

Dataset TMT had about 39.5 million observations.

This happened on a Windows 7 Pro 64-bit workstation with an Intel(R) Xeon(TM) E5-1630v3 3.7GHz 10M CPU and 64 GB of DDR4-2133 RAM. However, only about 14 GB of RAM was available to SAS at that time, because I am using RAM disk software that combines 50 GB of RAM with 100 GB of the first 256-GB SSD to form a 150-GB hybrid RAM disk. I am curious whether the issue would still occur with the full 64 GB of RAM, but I haven't tried that yet.

My first idea would have been to upgrade the RAM of your computer (if this is possible), but lvm's suggestion about PROC HPMIXED sounds very promising.

lvm
Rhodochrosite | Level 12

There are several "HP" (high-performance) procedures now. They run on a single machine or in distributed mode. There is no HPMEANS, but there is HPSUMMARY. This might work for your purpose of getting quantiles with very large data sets. With 40 M observations, quantiles will be difficult to get without the tricks of large-scale computing. At some point, you would need to get the distributed computing products.

For some procedures, SAS 9.4 won't allow the (non-HP) procedure to run if it will take too much time. This is frustrating. I have 9.3 and 9.4 on my desktop, and I can fit a mixed model on a large data set with 9.3 (taking many hours), but in 9.4 I just get a message that it would take too long to run.

lvm
Rhodochrosite | Level 12

Out of curiosity, I just simulated 50 million observations and determined the median and quartiles with PROC HPSUMMARY. No problem. I did this with "only" 8 GB of memory and a slow processor. It took less than 1/2 second of real time. It is important to use the P2 method of quantile estimation (an approximation). The default (OS) requires internal ordering of the observations, which is a challenge with so many observations.

proc hpsummary data=a qmethod=p2;
   var y;
   output out=out q1=q1 q3=q3 median=median mean=mean;
run;

proc print data=out;
run;

lvm
Rhodochrosite | Level 12

Correction: it took about 11 seconds of real time on 50 M records.

Rick_SAS
SAS Super FREQ

You can also use the QMETHOD=P2 option directly from PROC MEANS.
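
Applied to the PROC MEANS step posted earlier in this thread, that would look something like the sketch below (same data set and variable names as in that post). Keep in mind that P2 returns approximate quantiles rather than exact order statistics:

```sas
/* Same step as before, but with the P2 approximation instead of the default QMETHOD=OS */
proc means data=tmt mean min median qmethod=p2;
   var dttm;
run;
```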

lvm
Rhodochrosite | Level 12

Good point. It is interesting that it takes about the same amount of time with MEANS as with HPSUMMARY to get quartiles (P2 option) on 50 M observations (on my desktop).

FreelanceReinh
Jade | Level 19

Many thanks to both @kbaughma and @lvm!

kbaughma's initial post reminded me of the performance-related system options. I discovered that the MEMSIZE option on my machine was still set to its default of 2147483648 (= 2 GB), which meant that only a small portion of my 14 GB (or 64 GB after deactivating the RAM disk) had been available to SAS.

By simply setting MEMSIZE to MAX (during startup) my previously failed PROC MEANS step (see earlier post in this thread) ran without problems.
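
For anyone who wants to check their own setting, a small sketch: either of the two steps below writes the current MEMSIZE value to the log. MEMSIZE itself can only be set when SAS starts, for example with -MEMSIZE MAX on the command line or in the configuration file, so an OPTIONS statement inside a running session will not change it.

```sas
/* Show the current MEMSIZE value in the log */
proc options option=memsize value;
run;

/* The same information via a macro function call */
%put MEMSIZE=%sysfunc(getoption(memsize));
```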

And much more: an ordinary PROC SUMMARY was now able to cope with 640 million randomly generated observations (a 4.84 GB dataset) and calculated mean, min and, above all, median in less than 10 minutes -- without forcing me to resort to QMETHOD=P2 and its fluctuating results.

After this breakthrough I tried to push the limit even further and found that PROC HPSUMMARY achieved the same with 700 million observations (5.29 GB, 11 minutes, peak physical memory usage at about 53 GB), whereas PROC SUMMARY failed.

However, with 720 million obs. the old warning reappeared with either procedure.

So, the improvement by PROC HPSUMMARY over PROC SUMMARY -- in single-machine mode! -- in terms of processable numbers of observations was somewhere between 0 and 12.5 percent. There seemed to be no significant difference regarding run time. Of course, in distributed mode a completely different picture is to be expected.

Ameurgen
Obsidian | Level 7

Hello guys,

On the same topic, how can I estimate the total memory required to run PROC MIXED? Is there a specific formula?

Thank you

Rick_SAS
SAS Super FREQ

The documentation for many of the classic SAS/STAT procedures includes a section called "Computational Issues," which often includes a discussion of memory requirements. For PROC MIXED, see the "Computational Issues" section in the SAS Help Center.

JoséQ
SAS Employee

You might also want to consider PROC HPMIXED, which is designed for models with a large X and/or Z matrix, if that is your case here.

Ameurgen
Obsidian | Level 7

Thank you all,

I know about the HPMIXED solution, but I am curious what technique it applies, and what causes this computational issue and the memory requirements with PROC MIXED.

I need the formula or a simple explanation. I have found many suggestions and I don't know which one is correct.

Thank you again

lvm
Rhodochrosite | Level 12

The specific details for PROC MIXED are here:

https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.3/statug/statug_mixed_details58.htm#statug_mixe...

Memory determination depends on a lot of things. For example, how one specifies the model can make a big difference. A statement such as

random A A*B;

can use more memory (and be slower to fit) than using:

random int B / sub=A;

because the latter is processed by subjects. 
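
To make the comparison concrete, here is a sketch of the two specifications as complete steps. The data set and variable names (mydata, y, x) are placeholders for illustration only. Both steps describe the same variance components (A and B within A); only the way the RANDOM statement is written differs:

```sas
/* Hypothetical names: mydata, y, x; A and B are the classification variables. */

/* Effects-based specification */
proc mixed data=mydata;
   class A B;
   model y = x;
   random A A*B;
run;

/* Same variance components, specified by subject */
proc mixed data=mydata;
   class A B;
   model y = x;
   random int B / subject=A;
run;
```

Comparing the log notes on real time and memory for the two runs on a subset of the data should show how much the subject-based form helps for a particular model.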
