yaswoman
Calcite | Level 5
Hi, I have a huge data set (915K records) and I want to run PROC FREQ on just one field, which is 90 bytes long. I seem to be running out of space for some reason. Can anyone offer any help or suggestions on how I might get around this?

NOTE: The SAS System stopped processing this step because of insufficient memory.
I am trying to output this to another data set with an OUT= option, but still no joy.

thanks so much in advance.
6 REPLIES
Ksharp
Super User
Hi.
You can try to create an index for this variable.
I am not sure it will work. Just a suggestion.
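
For anyone unsure how to create an index, here is a minimal sketch. The names WORK.BIG and LONGVAR are hypothetical stand-ins for the real data set and the 90-byte field. The idea is that an index lets BY-group processing run without a prior PROC SORT, so the counting can be done group by group instead of holding every level in memory. Whether this is fast enough on 915K records is untested here; retrieving rows through an index can be slow.

[pre]
/* Hypothetical names: WORK.BIG is the large data set, LONGVAR the 90-byte field */
proc datasets library=work nolist;
   modify big;
   index create longvar;   /* simple index on the variable to be counted */
quit;

/* With the index in place, BY processing works without sorting first */
proc summary data=work.big;
   by longvar;
   output out=work.counts(drop=_type_);   /* _FREQ_ holds the count per value */
run;
[/pre]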


Ksharp
yaswoman
Calcite | Level 5
Thanks. Can you please explain further? I am not sure I know how to do that.

thanks.
darrylovia
Quartz | Level 8
There are a number of options that you could try.

1) Sort the data set by your variable of interest, then use PROC SUMMARY with just a BY statement and no VAR statement. PROC SUMMARY will write a variable called _FREQ_ to the output data set.

2) Use PROC SQL with a GROUP BY and see what happens.
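
Both options can be sketched roughly as follows. WORK.BIG and LONGVAR are hypothetical names for the large data set and the 90-byte field, and the output data set names are placeholders too. This is an untested outline of the two approaches, not a tuned solution.

[pre]
/* Hypothetical names: WORK.BIG is the large data set, LONGVAR the 90-byte field */

/* Option 1: sort only the variable of interest, then count BY groups */
proc sort data=work.big(keep=longvar) out=work.sorted;
   by longvar;
run;

proc summary data=work.sorted;
   by longvar;
   output out=work.counts(drop=_type_);   /* one row per value, count in _FREQ_ */
run;

/* Option 2: let PROC SQL do the grouping */
proc sql;
   create table work.counts_sql as
   select longvar, count(*) as frequency
   from work.big
   group by longvar;
quit;
[/pre]

Keeping only the one variable in the sort (the KEEP= data set option) reduces the disk space the sort needs.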

Darryl
Doc_Duke
Rhodochrosite | Level 12
Darryl's response should work.

The reason that you were running out of space is that the memory needs of FREQ are a function of the number of DISTINCT values of the variable times its length (the reference manual should have the exact formula). With that many observations and a text field, you could have lots of distinct values.
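
As a rough illustration only: if every one of the 915K records held a different value, the values alone would take about 915,000 x 90 = 82,350,000 bytes (roughly 82 MB), before any per-level overhead that FREQ adds. To see how many distinct values there really are, a query like the one below may help. WORK.BIG and LONGVAR are hypothetical names, and the query itself needs memory or utility disk space to run, so treat it as a sketch.

[pre]
/* Hypothetical names: WORK.BIG is the large data set, LONGVAR the 90-byte field */
proc sql;
   select count(distinct longvar) as n_distinct
   from work.big;
quit;
[/pre]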

The SORTing and SUMMARY are disk space dependent. SQL is a mix, as it tries to put as much in memory as it can and then relies on disk. Both will take longer than FREQ would have done if you had enough memory.

Doc Muhlbaier
Duke
yaswoman
Calcite | Level 5
thank you both Darryl and Duke. Will try your suggestions.

Much appreciated.
data_null__
Jade | Level 19
This is basically the same idea as already suggested.

I don't know how long it would take to sort the data. You might want to sort the data in groups, then combine and count. Be sure to keep only the variable that needs counting. That should save a lot of time if the data set has lots of variables.

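[pre]
/* Sort the data in two chunks, keeping only the variable to be counted */
proc sort data=sashelp.shoes(keep=Subsidiary firstobs=1 obs=200) out=bin1;
   by Subsidiary;
run;
proc sort data=sashelp.shoes(keep=Subsidiary firstobs=201 obs=max) out=bin2;
   by Subsidiary;
run;

/* Interleave the sorted chunks and count the rows in each BY group */
data freq;
   do Frequency=1 by 1 until(last.Subsidiary);
      set bin1 bin2;
      by Subsidiary;
   end;
   /* Running total across groups; one row is written per Subsidiary */
   CumulativeFrequency + Frequency;
run;
proc print;
run;
[/pre]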
