12-14-2011 02:00 PM
I have 50 lakh (5,000,000) observations to sort.
When I use PROC SORT to sort the data, it takes too much time.
Sometimes it takes 30 minutes or more of CPU time.
Can anybody suggest the best way to sort the data so that I can finish the task faster when disk space is low?
Which SAS options should I use to optimise PROC SORT?
12-14-2011 02:27 PM
I think you've asked this before. 5,000,000 is not a large number of records for SAS and the sort time should be measured in seconds not minutes.
What kind of computer are you using?
How much free disk space do you have?
How much memory does your system have?
How many variables are in the file?
How many variables are you sorting on?
12-14-2011 02:52 PM
I am sorting on 2 variables.
The dataset size is 3 GB.
The number of observations is more than 1.5 million.
The only drive we are allowed to work on has a capacity of 4 GB.
The actual elapsed time is more than 3 to 4 hours.
12-14-2011 06:39 PM
I think your need for optimization results from the size of your file. I've been using SAS for more than 37 years, but don't have an answer for everything yet. However, I'm intrigued by your question, so I'm going to repost it in another thread here, as well as on SAS-L.
One can easily create a file that only contains the variables you are sorting on, number the records, sort the file based on those two variables, and then resort the file back to its original order.
The time for doing such a task is far less than the time required to sort the entire file.
However, what I don't know is whether one can then use the new file to establish an index for the original file.
I'll let you know what I find out.
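For reference, here is a rough sketch of the key-only idea described above. The names work.big, key1, key2 and recno are placeholders I made up, not anything from the original poster's data. The last step is only one possible way to use the sorted key file: it reads the big file by observation number with POINT=. It assumes work.big allows direct access by observation number (for example, it is not compressed without POINTOBS=YES), it still writes a full sorted copy, and the random reads may turn out slower than a plain sort, so treat it as something to test rather than a recommendation.

```sas
/* Hypothetical names: work.big is the large data set,  */
/* key1 and key2 are the two sort variables.            */

/* Step 1: keep only the sort keys and record each      */
/* observation's original position.                     */
data work.keys;
  set work.big(keep=key1 key2);
  recno = _n_;
run;

/* Step 2: sort the small key file. */
proc sort data=work.keys;
  by key1 key2;
run;

/* Step 3 (assumption): fetch the full observations in  */
/* sorted order by direct access. The step ends when    */
/* work.keys runs out of observations. The POINT=       */
/* variable p is not written to the output data set.    */
data work.big_sorted;
  set work.keys(keep=recno);
  p = recno;
  set work.big point=p;
  drop recno;
run;
```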
12-14-2011 07:08 PM
I don't have an answer yet, but have been offered one option that might be a lead: the TAGSORT option on PROC SORT. I'm skeptical, as the documentation states that it might INCREASE processing time, but the option as described does what I was suggesting. I don't have time to test it, but it might be worth a try.
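A minimal sketch of how TAGSORT could be tried, again with the made-up names work.big, key1 and key2. FULLSTIMER is only there so the log shows CPU and real time for comparing a run with and without the option.

```sas
/* Write detailed CPU and real-time statistics to the log. */
options fullstimer;

/* TAGSORT keeps only the BY variables and observation     */
/* numbers in the temporary sort files, which reduces the  */
/* temporary disk space needed but may increase run time.  */
proc sort data=work.big out=work.big_sorted tagsort;
  by key1 key2;
run;
```

Running the same step without TAGSORT and comparing the two log notes should show whether it helps on a given system.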
12-14-2011 07:19 PM
A good place to start when investigating such problems is the SAS Support site. Search using the key words PROC SORT PERFORMANCE and you will get many useful links.
If you are using SAS under Windows this might be helpful:
12-14-2011 07:23 PM
Thanks, but I had already looked there. Using TAGSORT, on my system at least, reduces the CPU time from 2.6 seconds to about 1.6 seconds. However, sorting only the required variables reduced the time to about 0.2 seconds. What I'm looking for is a way to capitalize on that.