Hi All
I have 50 lakh (5,000,000) observations to sort.
When I use PROC SORT, it takes too much time.
Sometimes it takes 30 minutes or more of CPU time.
Can anybody suggest the best way to sort the data so that I can finish the task faster when disk space is limited?
Which SAS options should I use to optimise PROC SORT?
Thanks
I think you've asked this before. 5,000,000 is not a large number of records for SAS and the sort time should be measured in seconds not minutes.
Some questions:
What kind of computer are you using?
How much free disk space do you have?
How much memory does your system have?
How many variables are in the file?
How many variables are you sorting on?
Hi
I am sorting on 2 variables.
The dataset size is 3 GB.
The number of observations is more than 1.5 million.
The capacity is 4 GB on the specific drive, which is the only one we work on.
The actual run time is more than 3 to 4 hours.
There are 30 variables in total.
Can anybody help me optimise PROC SORT?
Urvish,
I think your need for optimization results from the size of your file. I've been using SAS for more than 37 years, but don't have an answer for everything yet. However, I'm intrigued by your question, so I'm going to repost it in another thread here, as well as on SAS-L.
One can easily create a file that contains only the variables you are sorting on, number the records, sort that file on those two variables, and then re-sort it back to its original order.
That takes far less time than sorting the entire file.
What I don't know, however, is whether one can then use the new file to establish an index for the original file.
I'll let you know what I find out.
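For what it's worth, the key-only idea above can be sketched roughly as follows. This is only a sketch under assumptions: WORK.BIG, K1 and K2 are hypothetical names for the large dataset and the two sort variables, and the last step uses the POINT= option of the SET statement as one possible way to use the sorted record numbers as a lookup into the original file. It has not been timed on a 3 GB file, and direct access can itself be slow on some disks.

```sas
/* Hypothetical names: WORK.BIG is the large dataset, K1 and K2 the sort keys. */

/* 1. Keep only the sort keys and record each row's original position. */
data keys;
  set big(keep=k1 k2);
  recno = _n_;
run;

/* 2. Sort the narrow file instead of the full 30-variable file. */
proc sort data=keys;
  by k1 k2;
run;

/* 3. Read the original rows in key order by direct access. */
data sorted;
  set keys(keep=recno);  /* the step ends when KEYS is exhausted */
  p = recno;             /* copy the row number into the POINT= variable */
  set big point=p;       /* fetch that row from the original file */
  drop recno;
run;
```

Note that POINT= needs a dataset that supports direct access, so a compressed dataset may need extra settings before this works.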
I don't have an answer yet, but I have been offered one option that might be a lead: the TAGSORT option on PROC SORT. I'm skeptical, as the documentation states that it might INCREASE processing time, but its description matches what I was suggesting. I don't have time to test it, but it might be worth a try.
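In case it helps anyone trying this, a minimal sketch of a TAGSORT run, assuming the same hypothetical names (WORK.BIG with sort variables K1 and K2). FULLSTIMER is switched on only so that the log shows real time, CPU time and memory, which makes it easy to compare the run with and without TAGSORT.

```sas
/* Hypothetical names: WORK.BIG is the large dataset, K1 and K2 the sort keys. */
options fullstimer;   /* write detailed timing and memory statistics to the log */

/* TAGSORT sorts only the BY variables plus an observation number, */
/* then uses the sorted tags to retrieve the full records.          */
proc sort data=big out=sorted tagsort;
  by k1 k2;
run;
```

Because the temporary files hold only the keys, this mainly helps when disk space is tight; whether it is faster has to be measured on the actual system.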
A good place to start when investigating such problems is the SAS Support site. Search using the key words PROC SORT PERFORMANCE and you will get many useful links.
If you are using SAS under Windows this might be helpful:
Thanks, but I had already looked there. Using TAGSORT, on my system at least, reduces the CPU time from 2.6 seconds to about 1.6 seconds. However, using only the required variables reduced the time to about 0.2 seconds. What I'm looking for is a way to capitalize on that.
Another alternative is a hash table, but I am not sure how fast it will be.
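A rough sketch of what the hash approach might look like, again assuming the hypothetical names WORK.BIG, K1 and K2. The whole dataset is loaded into memory, so this only works if the session has enough memory for it, which may not be the case for a 3 GB file.

```sas
/* Hypothetical names: WORK.BIG is the large dataset, K1 and K2 the sort keys. */
data _null_;
  if 0 then set big;                 /* define the variables without reading rows */
  declare hash h(dataset:'big',      /* load every row of BIG into memory */
                 ordered:'a',        /* keep entries in ascending key order */
                 multidata:'y');     /* keep rows that share the same keys */
  h.defineKey('k1', 'k2');
  h.defineData(all:'y');             /* carry all variables as data items */
  h.defineDone();
  h.output(dataset:'sorted');        /* write the rows out in key order */
  stop;
run;
```

No utility files are written to disk, so disk space is not the constraint here; memory is.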
Thanks all
I'll try the options you all suggested.
Regards