Question on Sorting

Occasional Contributor
Posts: 19

Say you have a data set with variables A, B, C, D, E that is sorted with PROC SORT by A, B, C. You then create subsets (new data sets) from that original data set but never change the original sorted data. Later on I need my original data set to be sorted by A, B. Isn't it already sorted by A, B, since I sorted by A, B, C originally? I am taking over a piece of code at work, and the data set takes a long time to sort (lots of data). I noticed it is sorted twice and was hoping to eliminate the second sort. Thanks in advance.

Respected Advisor
Posts: 3,799

Re: Question on Sorting

Yes, if you sort by A B C it is by definition sorted by A B.  I find that most programmers SORT way too much.  It is important to remember that SORTED does not necessarily mean that the data was created by PROC SORT.  For example, PROCs that use CLASS statements create summary data that is sorted by the CLASS variables in most cases.  It is good to learn and exploit ways to create or maintain data in sorted order.
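
To illustrate both points, here is a minimal sketch. The data set names (have, first_per_ab, sums, check) and the sample values are made up for illustration; only the variable names come from the question. It sorts once by A B C, then uses BY A B without a second sort, and also shows PROC SUMMARY with a CLASS statement producing output that can be read with BY A B directly.

```sas
/* Hypothetical sample data; variable names follow the question */
data have;
  input A B C D E;
  datalines;
2 1 3 10 5
1 2 1 20 6
1 1 2 30 7
1 1 1 40 8
2 1 1 50 9
;
run;

/* The one (expensive) sort */
proc sort data=have;
  by A B C;
run;

/* No second sort needed: rows ordered by A B C are also ordered by A B. */
/* Keeps the first row of each A*B group.                                */
data first_per_ab;
  set have;
  by A B;
  if first.B;
run;

/* CLASS does not require sorted input. With NWAY the output has one   */
/* row per A*B combination, in ascending order of A and B by default.  */
proc summary data=have nway;
  class A B;
  var D;
  output out=sums sum=total_D;
run;

/* BY processing on the summary output without a PROC SORT step */
data check;
  set sums;
  by A B;
run;
```

If the rows were not actually in A B order, the DATA steps with BY A B would stop with an error saying the BY variables are not properly sorted, so a mistake here fails loudly rather than silently.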

Frequent Contributor
Posts: 94

Question on Sorting

Posted in reply to data_null__

The PRESORTED option added to PROC SORT in SAS 9.2 could help avoid the heavy sorting costs while still allowing you to be completely certain the data is sorted as required.

Obviously in Null's examples these outputs are sorted by definition, but this could help especially if you're using data created outside your own program.

http://support.sas.com/documentation/cdl/en/proc/61895/HTML/default/viewer.htm#a000146878.htm
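
A minimal sketch of how that might look, assuming a data set named big (a made-up name) whose rows are already in A B C order. With PRESORTED, PROC SORT first checks whether the observations are in BY order and only sorts if they are not.

```sas
/* Hypothetical data, generated so that it is already in A B C order */
data big;
  do A = 1 to 3;
    do B = 1 to 3;
      do C = 1 to 3;
        D = A*100 + B*10 + C;
        output;
      end;
    end;
  end;
run;

/* PRESORTED: verify the order first, sort only if the check fails. */
/* Here the rows are already in A B order, so no sort is performed. */
proc sort data=big presorted;
  by A B;
run;
```

The check still reads the data once, so it is not free, but a sequential pass is usually much cheaper than a full sort. The SAS log reports whether the order was verified or a sort was needed.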

Respected Advisor
Posts: 3,799

Question on Sorting

PRESORTED would be a good place to start.  The OP could add the option to the existing PROC SORT steps and measure any performance difference.  The improvement might be "good enough" without having to rewrite everything, which would require a good bit of testing.  Depends on the app and level of risk, I reckon.

Super User
Posts: 10,044

Question on Sorting

If you want to avoid sorting a data set more than once, the best way is to create an index on the data set; then you will not need PROC SORT to pre-order the data set before a BY statement.

This is especially useful for large tables.

Ksharp
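
A minimal sketch of the index approach, assuming a data set named have and a composite index named AB (both names are made up for illustration). The rows are deliberately left unsorted; the BY statement works because SAS can use the index to return the rows in A B order.

```sas
/* Hypothetical unsorted data */
data have;
  input A B C D;
  datalines;
2 1 3 10
1 2 1 20
1 1 2 30
1 1 1 40
2 1 1 50
;
run;

/* Create a composite index on A and B; no PROC SORT is run */
proc datasets library=work nolist;
  modify have;
  index create AB = (A B);
quit;

/* BY A B on unsorted data: SAS reads the rows through the index */
data first_per_ab;
  set have;
  by A B;
  if first.B;
run;
```

One caveat worth measuring on your own data: reading a whole table through an index is generally slower than reading an already sorted table sequentially, so the index tends to pay off most when it replaces repeated sorts or when only part of the table is read.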

New Contributor
Posts: 4

Question on Sorting

As others have pointed out, it should already be sorted.

You can easily let SAS know that it's sorted by setting the sortedBy= data set option.  This can be useful when creating subsets of the parent data set that you ABSOLUTELY KNOW ARE STILL IN THE SORT ORDER.  Here is the syntax:

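/* SORTEDBY= only asserts the order; SAS does not sort or verify it here */
data junk (sortedBy= i x);
  do i = 1 to 10;
    x + 1;      /* sum statement: x is retained and incremented each pass */
    output;
  end;
run;

/* The sort information in the output shows the asserted BY variables */
proc contents data = junk;
run;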
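
For the subset case in the original question, a sketch might look like the following. The data set names parent and subset and the WHERE condition are assumptions for illustration. Dropping rows does not change the relative order of the rows that remain, so the subset can safely carry the parent's sort order.

```sas
/* Hypothetical parent data, sorted once by A B C */
data parent;
  do A = 1 to 3;
    do B = 1 to 2;
      do C = 1 to 2;
        D = A*100 + B*10 + C;
        output;
      end;
    end;
  end;
run;

proc sort data=parent;
  by A B C;
run;

/* The subset only removes rows, so the A B C order is preserved. */
/* SORTEDBY= records that fact on the new data set.               */
data subset (sortedby=A B C);
  set parent;
  where B = 1;
run;

/* Check the sort information recorded on the subset */
proc contents data=subset;
run;
```

Because SORTEDBY= is an assertion rather than a check, a wrong value can cause errors or wrong results in later steps that trust it. Pairing it with PROC SORT's PRESORTED option is one way to verify the claim when in doubt.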

Skipping redundant sorts is usually the best way to gain efficiency in code, but do yourself a big favor and comment the hell out of it.

Good luck. -s www.sascoders.com
