olabuh
Calcite | Level 5
I tried running PROC FREQ on a dataset with 15,000,000 observations and the program froze. I had to restart it, and this happened many times. My RAM is 8 GB. Do I need to upgrade the RAM to run large datasets?
8 REPLIES
Tom
Super User

How do you know it was frozen? 

15,000,000 observations is not that many, but it is a large number of distinct categories for PROC FREQ.

 

Another thing that makes SAS appear to be "frozen" is if it is just waiting for you to complete your program. If you are missing a semicolon or a RUN; statement, or have mismatched quotes or unbalanced parentheses, the SAS compiler may simply be waiting for you to submit the rest of the code for the step before it compiles and runs it.

 

olabuh
Calcite | Level 5
I am sure I put in the correct code. The process started running but did not complete for over an hour. The previous code I ran took 6 minutes.
ErikLund_Jensen
Rhodochrosite | Level 12

Hi @olabuh 

 

Have you checked if the process is frozen, i.e. doing nothing, or actually running? Find the SAS process in Task Manager, right-click any column header and select the three I/O byte columns, then watch for 10 seconds to see whether CPU or these columns change. If not, the process is frozen, and you can (and should) kill it in Task Manager.

 

And free as many resources as possible before you run your job. Close other memory-consuming apps like MS Office products.

 

You might have one or more "dead" SAS processes holding resources besides the one you are working in. Kill them too to free resources. If you put this line in your program before your PROC step, the PID is written to the log before the session freezes, so you can identify the actual process in Task Manager if more than one SAS process is running:

 

%put &=sysJobID;

 

And before you invest in more RAM, see if your SAS is configured to use all available resources. Unfortunately that is beyond my knowledge, but I am sure there are many experts out there, if you start another thread with that topic.
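
As a starting point for that check, here is a minimal sketch that writes the current memory setting to the log. It assumes MEMSIZE is the relevant option for your installation, which nobody in this thread has confirmed. MEMSIZE can only be set at startup (in the config file or on the command line), so this only reports the value; it does not change it.

```sas
/* Report the MEMSIZE system option, including how it was set. */
proc options option=memsize value;
run;

/* The same value written to the log as a single line. */
%put NOTE: MEMSIZE=%sysfunc(getoption(memsize));
```

If the reported value is far below the physical RAM, the startup configuration may be worth looking at before buying hardware.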

Ksharp
Super User

As Tom said, ten million is not big data for SAS. And PROC FREQ is a multi-session PROC.

You need to add the NOPRINT option.

 

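ods select none;                        /* suppress all ODS output */
proc freq data=sashelp.class noprint;   /* NOPRINT: no displayed tables */
table age/out=want nopercent nocum;     /* OUT=: write the counts to data set WANT */
run;
ods select all;                         /* restore ODS output */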
VDD
Ammonite | Level 13

Your code must have issues in it.

I run a Freq on 680,000,000+ records each month, which takes about 40 minutes. Look in the directory where the work is being done to see if the output results table has a lock file version that is being updated.

 

Test the code with a small sample to ensure it works before running the Freq on the large dataset.
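
One way to do that is the OBS= data set option, which limits how many observations the step reads. A minimal sketch, using SASHELP.CARS as a stand-in for the large dataset; the variable TYPE and the output name WORK.FREQ_SAMPLE are placeholders for illustration only:

```sas
/* Read only the first 50 observations so the step finishes quickly. */
proc freq data=sashelp.cars(obs=50) noprint;
  tables type / out=work.freq_sample;
run;

/* Check the log and the small output table before removing OBS=. */
proc print data=work.freq_sample;
run;
```

If this completes cleanly, remove OBS= (or raise it in steps) and rerun on the full data.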

 

Tom
Super User

Make sure that you are generating frequencies on a variable where it makes sense, like a categorical variable with fewer than 1,000 unique values. For example, don't run frequencies on variables like a unique observation ID or a DATE that is going to generate 15 million distinct values.
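
To find out which variables are safe before running the frequencies, you can count distinct values first. A minimal sketch with PROC SQL, again using SASHELP.CARS as a stand-in; MAKE, TYPE, and WORK.CARDINALITY are placeholder names, so substitute your own dataset and variables:

```sas
/* Count distinct values per candidate variable, plus total rows. */
proc sql;
  create table work.cardinality as
  select count(distinct make) as n_make,
         count(distinct type) as n_type,
         count(*)             as n_rows
  from sashelp.cars;
quit;

proc print data=work.cardinality;
run;
```

A variable whose distinct count is close to n_rows is effectively an ID and is a poor candidate for PROC FREQ.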

 

Turn off any ODS outputs. Writing really large results to ODS can take a long time.

 

Or better, use the NOPRINT and OUT= options on your TABLES statement instead.

Reeza
Super User
I've used PROC FREQ on a computer with 8 GB with 20 million records and it completes. You do not need more RAM. Show your code, and a screenshot of your frozen 'status'.

Are you running for example on one variable or all variables? If you're running it on all variables and one is an ID you have a single table of 1's but it takes time to count all that properly. But right now, no idea what's happening with the information you've provided.
ballardw
Super User

@Reeza wrote:
I've used PROC FREQ on a computer with 8 GB with 20 million records and it completes. You do not need more RAM. Show your code, and a screenshot of your frozen 'status'.

Are you running for example on one variable or all variables? If you're running it on all variables and one is an ID you have a single table of 1's but it takes time to count all that properly. But right now, no idea what's happening with the information you've provided.

And LOTS of time trying to build a default HTML table holding that many rows to show in the results window. Plus, if there are multiple variables with high cardinality (number of distinct values) like income, street address, account number, purchase order identifier and similar, each of those tables may be another resource hog.

 

 

