Hello,
When I try to sort a table of 70 Go, I get a "DISK full" message...
But I have a WORK of 903 Go 😞
Normally I'd expect a SORT to use no more than 70 Go * 3 of space.
Do you have an idea why I'm running into this kind of problem?
The table is like this:
100,561,233 rows
270 columns
Observation length: 14,284
Table is compressed (CHAR)
To work around the problem, I'm currently trying TAGSORT; I'll also try splitting the data into 10 sub-tables.
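The TAGSORT attempt looks something like this (the dataset name and BY variables here are placeholders, not my real ones):

proc sort data=work.bigtable tagsort;
  by id date;
run;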
The utility file of a PROC SORT is not compressed, so you can deduce its size from the size of the file and the compression factor shown in the log when you create it.
Depending on the contents, your compression ratio could well be above 90%, and then it is no surprise you run out of space.
Using tagsort is the right remedy.
I'd also consider adding disk space that is physically separate from your WORK and setting UTILLOC to it. That prevents concurrent reads and writes on the WORK disks.
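To see where utility files currently go, this is valid in any session:

proc options option=utilloc;
run;

Note that UTILLOC itself can only be set at SAS startup (on the command line or in the config file), not with an OPTIONS statement.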
There are probably other datasets in WORK from previous steps and/or from parallel sessions running at the same time.
One solution (maybe not the best) is to split the 70 GB dataset into several smaller ones,
sort each separately, and finally merge them back.
What is 70Go?
Some things to check: do you have anything else in WORK?
How are you sorting the data?
Where is the data residing? Is it on a network or locally?
Thank you for your message.
The size is 70 GB (gigabytes).
I have nothing else in my WORK: I'm monitoring it every minute with df (I'm under AIX), and I can see only my WORK folder growing, growing...
The table is already in WORK, and I want to sort it.
Are you sure you have access to the full size? IT can often limit work space.
Perhaps this article will help - particularly SORTSIZE.
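A minimal sketch of setting it on the step itself (the 2G value and the dataset name are just examples; tune them to your system):

proc sort data=work.bigtable sortsize=2G;
  by id;
run;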
Not sure how you plan on working with that data; it's a big chunk any way you look at it (I personally wouldn't be happy with 270 columns - anything more than 20 or so is difficult programming-wise).
Also, what's the exact error you receive?
Indeed,
I have a ratio of 95.43 percent (!?)
70 GB with a 95.43% compression factor works out to something below 700 GB uncompressed, and 700 GB * 3 is more than my 903 GB WORK folder.
TAGSORT did the job:
NOTE: PROCEDURE SORT used (Total process time):
real time 1:41:46.94
user cpu time 27:57.53
system cpu time 4:36.74
memory 266864.43k
OS Memory 272352.00k
Actually, your compressed file is less than 5% of its uncompressed size. So the uncompressed file would be roughly 20 * 70 GB, and that amounts to 1.4 TB(!).
When sorting large compressed datasets with a considerable compression rate (>80%), I always use TAGSORT, just to prevent a disk full condition in my UTILLOC.
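A quick back-of-the-envelope check in a data step, using the numbers from this thread:

data _null_;
  compressed_gb = 70;     /* physical size of the compressed table */
  reduction = 0.9543;     /* percent reduction reported in the log */
  uncompressed_gb = compressed_gb / (1 - reduction);
  put "Estimated uncompressed size: " uncompressed_gb comma10.1 " GB";
run;

This prints roughly 1,531.7 GB, in the same ballpark as the rough 20 * 70 GB figure above.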
For a big table, I would split it into smaller tables, sort each one separately (so each sort needs much less utility space), and interleave them back together later. Like:
data F M;
  set sashelp.class;
  if sex='F' then output F;
  else if sex='M' then output M;
run;

proc sort data=F; by name; run;
proc sort data=M; by name; run;

data want;
  /* SET with BY interleaves the two sorted tables into one sorted table */
  set F M;
  by name;
run;