UrvishShah
Fluorite | Level 6

Hi All

I have 50 lakh (5,000,000) observations to sort.

When I use PROC SORT, it takes too much time.

Sometimes it takes 30 minutes or more of CPU time.

Can anybody suggest the best way to sort the data so that I can finish the task faster when disk space is low?

Which SAS options should I use to optimise PROC SORT?

Thanks

10 REPLIES
art297
Opal | Level 21

I think you've asked this before.  5,000,000 is not a large number of records for SAS and the sort time should be measured in seconds not minutes.

Some questions:

What kind of computer are you using?

How much free disk space do you have?

How much memory does your system have?

How many variables are in the file?

How many variables are you sorting on?

UrvishShah
Fluorite | Level 6

Hi

I am sorting on 2 variables.

The dataset is 3 GB.

There are more than 1.5 million observations.

The only drive we can work on has 4 GB of capacity.

And the actual time taken is more than 3 to 4 hours.

UrvishShah
Fluorite | Level 6

And there are 30 variables in total.

UrvishShah
Fluorite | Level 6

Can anybody help me with PROC SORT optimisation?

art297
Opal | Level 21

Urvish,

I think your need for optimization results from the size of your file.  I've been using SAS for more than 37 years, but don't have an answer for everything yet.  However, I'm intrigued by your question, so I'm going to repost it in another thread here, as well as on SAS-L.

One can easily create a file that only contains the variables you are sorting on, number the records, sort the file based on those two variables, and then resort the file back to its original order.

The time needed for such a task is far less than the time required to sort the entire file.
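The first steps described above could be sketched roughly as follows. The names work.have, work.keys, key1, key2, and rec are hypothetical placeholders, not taken from the original poster's data.

```sas
/* Hypothetical names: work.have is the full 30-variable table, */
/* key1 and key2 are the two sort variables.                    */

/* Keep only the sort keys and record each row's original position. */
data work.keys;
    set work.have(keep=key1 key2);
    rec = _n_;    /* observation number in work.have */
run;

/* Sort the narrow file; far fewer bytes are moved per observation. */
proc sort data=work.keys;
    by key1 key2;
run;
```

After this, work.keys holds the keys in sorted order, with rec pointing back to the matching row of the full file.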

However, what I don't know is whether one can then use the new file to establish an index for the original file.

I'll let you know what I find out.

art297
Opal | Level 21

I don't have an answer yet, but have been offered one option that might be a lead: the TAGSORT option on PROC SORT.  I'm skeptical, as the documentation states that it might INCREASE processing time, but its description matches what I was suggesting.  I don't have time to test it, but it might be worth a try.
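For anyone who wants to try it, a minimal sketch of a TAGSORT run is shown here. The dataset and variable names (work.have, work.sorted, key1, key2) are assumptions for illustration only.

```sas
/* Hypothetical names: work.have, work.sorted, key1, key2. */
options fullstimer;    /* log detailed real time, CPU time, and memory */

/* TAGSORT keeps only the BY variables and observation numbers in the  */
/* temporary sort files, which lowers temporary disk use but may       */
/* lengthen the run.                                                   */
proc sort data=work.have out=work.sorted tagsort;
    by key1 key2;
run;
```

Running the same step with and without TAGSORT and comparing the FULLSTIMER figures in the log should show whether it helps on a given system.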

SASKiwi
PROC Star

A good place to start when investigating such problems is the SAS Support site. Search using the key words PROC SORT PERFORMANCE and you will get many useful links.

If you are using SAS under Windows this might be helpful:

http://support.sas.com/documentation/cdl/en/hostwin/63047/HTML/default/viewer.htm#n0ea63jfjic0vpn15d...
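As a starting point for experiments, a few PROC SORT settings that affect memory use and run time can be tried directly. This sketch is an assumption about what might help here and is not taken from the linked page; the names work.have, work.sorted, key1, key2 and the 1G value are placeholders.

```sas
/* Hypothetical names and an arbitrary memory value, for illustration. */
options fullstimer;    /* log detailed real time, CPU time, and memory */

proc sort data=work.have
          out=work.sorted
          sortsize=1G   /* memory the sort may use; tune to the machine */
          noequals;     /* do not preserve input order within BY groups */
    by key1 key2;
run;
```

NOEQUALS is only appropriate when the relative order of observations with identical key values does not matter.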

art297
Opal | Level 21

Thanks, but I had already looked there.  Using TAGSORT, on my system at least, reduces the CPU time from 2.6 seconds to about 1.6 seconds.  However, sorting only the required variables reduced the time to about 0.2 seconds.  What I'm looking for is a way to capitalize on that.
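One possible way to capitalize on the fast key-only sort, offered as an untested sketch rather than a confirmed answer, is to read the full file back in key order with the POINT= option of the SET statement. All names here (work.have, work.keys, work.sorted, key1, key2, rec, p) are hypothetical.

```sas
/* Hypothetical names throughout; a sketch, not a benchmarked solution. */

/* Narrow file: sort keys plus the original observation number. */
data work.keys;
    set work.have(keep=key1 key2);
    rec = _n_;
run;

proc sort data=work.keys;
    by key1 key2;
run;

/* Read the full file in sorted-key order by direct access. */
data work.sorted(drop=rec);
    set work.keys(keep=rec);   /* the step ends when the keys run out */
    p = rec;                   /* POINT= variable; not written to output */
    set work.have point=p;     /* fetch observation number p */
run;
```

The random reads in the last step may be slow on a large file, so the total time would need to be compared against a plain PROC SORT before relying on this.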

Ksharp
Super User

Another alternative is a hash table, but I am not sure how fast it will be.
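A minimal sketch of the hash approach, assuming the whole table fits in memory (which may not hold for a 3 GB file on a small machine). The names work.have, work.sorted, key1, and key2 are placeholders.

```sas
/* Hypothetical names; requires enough memory to hold all of work.have. */
data _null_;
    if 0 then set work.have;    /* define the variables without reading */

    /* Ordered hash: ascending by key, duplicate keys allowed. */
    declare hash h(dataset:'work.have', ordered:'a', multidata:'y');
    h.defineKey('key1', 'key2');
    h.defineData(all:'yes');    /* carry every variable as data */
    h.defineDone();

    h.output(dataset:'work.sorted');   /* write rows in key order */
    stop;
run;
```

The order of observations that share the same key values is not guaranteed to match the input order.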

UrvishShah
Fluorite | Level 6

Thanks, all.

I'll try the options you have all suggested.

Regards
