Phil_NZ
Barite | Level 11

Hi all SAS Experts,

I have been struggling with a proc sort step in my SAS code, could you please help me to sort it out?

 

My code and error is

NOTE: There were 162871058 observations read from the data set WORK.EX_NON_TRADING.


61   proc sort data=ex_non_trading_;
62   by gviidkey datadate;
63   run;

ERROR: No disk space is available for the write operation. Filename = C:\Users\pnguyen\AppData\Roaming\SAS\EnterpriseGuide\EGTEMP\SEG-19596-3f184c1f\contents\SAS Temporary Files\SAS_util000100004A74_it082760\ut4A74000005.utl.
ERROR: Failure while attempting to write page 1085 of sorted run 106.
ERROR: Failure while attempting to write page 304325 to utility file 1.
ERROR: Failure encountered while creating initial set of sorted runs.
ERROR: Failure encountered during external sort.
ERROR: Sort execution failure.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: There were 145088137 observations read from the data set WORK.EX_NON_TRADING_.
WARNING: The data set WORK.EX_NON_TRADING_ may be incomplete. When this step was stopped there were 0 observations and 26 variables.
WARNING: Data set WORK.EX_NON_TRADING_ was not replaced because this step was stopped.

So, you can see the file ex_non_trading has 162871058 obs but it was damaged and reduced to only 145088137 obs after proc sort. 

I really need your help and appreciate any contribution.

 

Warmest regards,

Phil.

Thank you for your help, have a fabulous and productive day! I am a novice today, but someday when I accumulate enough knowledge, I can help others in my capacity.
Accepted Solutions
ballardw
Super User

With large datasets try adding the TAGSORT option to the Proc Sort statement.

 

The TAGSORT option in the PROC SORT statement is useful in sorts when there might not be enough disk space to sort a large SAS data set. When you specify TAGSORT, the sort is a single-threaded sort. Do not specify TAGSORT if you want SAS to use multiple threads to sort.
When you specify the TAGSORT option, only sort keys (that is, the variables specified in the BY statement) and the observation number for each observation are stored in the temporary files. The sort keys, together with the observation number, are referred to as tags. At the completion of the sorting process, the tags are used to retrieve the records from the input data set in sorted order. Thus, in cases where the total number of bytes of the sort keys is small compared with the length of the record, temporary disk use is reduced considerably. You should have enough disk space to hold another copy of the data (the output data set) or two copies of the tags, whichever is greater. Note that while using the TAGSORT option can reduce temporary disk use, the processing time can be much higher. However, on PCs with limited available disk space, the TAGSORT option can allow sorts to be performed in situations where they would otherwise not be possible.
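
As a minimal sketch, applying this to the step from the original post only means adding the option to the PROC SORT statement. The data set name and BY variables are taken from the posted log; nothing else is assumed.

```sas
/* Same sort as in the posted log, with TAGSORT added so that only the */
/* BY variables and observation numbers go to the temporary files.     */
proc sort data=ex_non_trading_ tagsort;
  by gviidkey datadate;
run;
```

Expect this to run noticeably longer than a regular sort, since the records are fetched from the input data set in sorted order at the end.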


20 REPLIES 20
Reeza
Super User
Talk to your IT team or someone internal to your company because I suspect this is a process issue. You've previously stated you're on a server but it appears you're running it locally (C drive reference) so if you run it on the server instead you shouldn't have any issues. 162 million records is too big for a desktop machine. Not sure why/how you're running locally though.



Phil_NZ
Barite | Level 11

Hi @Reeza 

 

I did talk to the IT guy and he confirmed that I am running on a server rather than locally. I am quite surprised about that as well.

 

Warmest regards.

Reeza
Super User

Your log is stating otherwise....has your IT person seen that?

 

Anyway, you're a paid user, so contact tech support if your IT team won't help you. This isn't something we can really help with. In general, sorting needs about 3X the space of the data set: one copy for the original data set, a second for the temporary copy used while it sorts, and a third for the final version. Some algorithms don't work that way, but in general you're going to need a ton of space to sort a data set that size.

 

You can change where sorts happen to help with this, but again you may need to change your config file or use a user/temp library and that depends on your set up. 
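
A rough sketch of what that could look like. The library name `bigdisk` and the path `D:\sas_data` are hypothetical; substitute a location that has enough free space on your set up.

```sas
/* Show where WORK and the sort utility files currently live */
proc options option=work;
run;
proc options option=utilloc;
run;

/* Hypothetical library on a drive with more free space */
libname bigdisk 'D:\sas_data';

/* Write the sorted copy there instead of replacing the WORK data set in place */
proc sort data=ex_non_trading_ out=bigdisk.ex_non_trading_sorted;
  by gviidkey datadate;
run;
```

Note that OUT= only moves the sorted output. The utility files written during the sort still go to the UTILLOC location, and as far as I know that option can only be set at startup (config file or command line), which is why the config file may need to change.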

 

Phil_NZ
Barite | Level 11

Hi @Reeza 

Many thanks for your explanation and suggestion, @Reeza.

I learnt a lot about the mechanism behind the code from you.

Warmest regards,

Phil.

SASKiwi
PROC Star

If you want to know where your SAS is running in EG, in the Server List right click on the server name and select Properties. The Properties window will tell you the name of your SAS server. If you are running SAS locally on your PC then your server name is likely to be Local rather than SASApp.
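
You can also cross-check from code. This is only a sketch using automatic macro variables and the PATHNAME function; run it in your EG session and read the values in the log.

```sas
/* Write the machine and WORK location of the running SAS session to the log */
%put Host name: &syshostname;
%put Operating system: &sysscp &sysscpl;
%put WORK location: %sysfunc(pathname(work));
```

If the host name is your own PC and the WORK location is a folder under your C:\Users profile, the session is running locally.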

Phil_NZ
Barite | Level 11

Hi @SASKiwi 

Sorry for the late reply; I was sick for a few days and have just recovered. Thank you for your suggestion. Following your guidance, I see the Properties shown below.

Does it mean that I am running locally rather than on a server?

[Screenshot: server Properties window, image not available]

 

Warm regards,

SASKiwi
PROC Star

@Phil_NZ  - Correct. All of your SAS processing is on your PC. I suggest you open Windows Explorer on your PC and check the amount of free space you have on your local drives (C, D etc). That will tell you how big a SAS dataset you can cope with.
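
To compare the free space against the data set, something like the following sketch reports its size on disk. The 3X multiplier is just the rule of thumb mentioned earlier in this thread, not an exact requirement.

```sas
/* Size of the data set on disk, plus a rough 3X estimate of sort space */
proc sql;
  select memname,
         nobs,
         obslen,
         filesize format=comma20.,
         filesize * 3 as rough_sort_space format=comma20.
  from dictionary.tables
  where libname = 'WORK' and memname = 'EX_NON_TRADING_';
quit;
```

Both size columns are in bytes, so they can be compared directly with the free space Windows Explorer reports for the drive.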

Phil_NZ
Barite | Level 11

Hi @ballardw 

Thank you for your insightful idea. From my understanding, the only problem with TAGSORT is that it takes more time to process compared to PROC SORT without TAGSORT. So, it seems to be a good idea to let the computer run with TAGSORT at night while I am sleeping.

 

Warm regards.

Kurt_Bremser
Super User

TAGSORT is mainly used with large datasets that are stored with the COMPRESS option and a serious compression factor. Since the utility file will only contain the BY variables and the observation pointer, it is much smaller than the usual (uncompressed!) utility file that contains all columns.
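
A small sketch of that situation, assuming the data set from the original post. The output name `ex_non_trading_sorted` is hypothetical.

```sas
/* The "Compressed" attribute in the output shows whether the data set is stored compressed */
proc contents data=ex_non_trading_;
run;

/* TAGSORT keeps the utility file small; COMPRESS=YES keeps the sorted output small */
proc sort data=ex_non_trading_
          out=ex_non_trading_sorted(compress=yes)
          tagsort;
  by gviidkey datadate;
run;
```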

 

Phil_NZ
Barite | Level 11

Hi @Kurt_Bremser 

Your suggestion confused me a little bit. So, TAGSORT is only good with compressed datasets (created by using options compress=yes)? Do you also mean that a large dataset that has not been compressed with the compress option will not work that well with TAGSORT?

 

Please let me know if I have misunderstood you.

 

Warmest regards.

SASKiwi
PROC Star

@Phil_NZ  - From what I understand you have a remote SAS server available to you at your university, so why not switch to using it for dealing with large datasets rather than wasting a lot of time trying to get them to fit on a PC?

Phil_NZ
Barite | Level 11

Hi @SASKiwi 

Yes, from your explanation, I realize that I am using the local machine to process a large dataset. I see it as a chance to learn more about how to deal with a large dataset with limited memory. Right now, the IT department at our school is busy with people moving from COVID restriction level 3 to level 1, so I had better learn quite a bit by myself and develop my knowledge as well.

 

Thank you for your dedicated and unconditional help so far.

 

Warmest regards,

Phil

SASKiwi
PROC Star

@Phil_NZ  - Understood. BTW you are studying at my old university 😀
