AndersS
Lapis Lazuli | Level 10

Hi! I need to sort rather large arrays in the DATA step. All values are numeric and unique. The number of values is up to 50 thousand, perhaps more.

Is Quicksort the best?

Where can I find the best SAS code for Quicksort?

What alternatives are there for Quicksort?

 

I use SAS ODA (OnDemand for Academics). I want to write code that can be published and is very general.

Many thanks in advance!
(I have been googling for a while. It is not easy to find the best answer.)

/Br AndersS

Anders Sköllermo (Skollermo in English)
Accepted Solutions
Cynthia_sas
SAS Super FREQ

Hi:

  My tendency would be to use CALL SORTN, which is designed for sorting numeric array members. I did find a reference to "QUICKSORT" in this older user group paper by Paul Dorfman: https://support.sas.com/resources/papers/proceedings/proceedings/sugi26/p096-26.pdf . However, I believe the paper may have pre-dated the introduction of CALL SORTN.
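For readers who have not used it, a minimal sketch of CALL SORTN on a small numeric array might look like the following. The array name `x`, its five values, and the variable names `x1`-`x5` are made up for illustration; they are not from the original question.

```sas
data _null_;
  /* Hypothetical five-element numeric array with initial values */
  array x[5] x1-x5 (5 3 9 1 7);

  /* Sort the array elements in place, in ascending order */
  call sortn(of x[*]);

  /* Write the sorted values to the log */
  put x[*];
run;
```

The whole sort is the single `call sortn(of x[*]);` statement. Missing values, if any are present, sort before nonmissing values.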

  You may be limited by memory as to the size of the array. This previous forum thread discusses memory as a limiting factor in array size: https://communities.sas.com/t5/SAS-Programming/what-s-the-limit-to-how-many-elements-variables-a-SAS... .

  I'm sure that others with more experience sorting arrays will have additional feedback.

Cynthia

 

6 REPLIES
AndersS
Lapis Lazuli | Level 10

Hi! YES!   

p.s.

The limit in SAS ODA is around 250 million values.


AndersS
Lapis Lazuli | Level 10

Hi! I have run some tests on the Linux server for SAS ODA.

    I used PROC NLIN with a quadratic model.
    CPU: time in seconds for CALL SORTN.
    SIZE: millions of array elements.

(It is good to have all the examples in the SAS documentation. Just cut and paste.)

 

Result: An almost straight line.
Sorting methods are often close to linear (like O(N*log(N))) for "small models" and more quadratic (like O(N*N)) for "large models".
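A timing test of this kind could be sketched roughly as below. This is not the code used for the results above: the macro name `time_sortn`, the seed, and the use of uniform random values are assumptions, and it measures elapsed wall-clock time with `datetime()` rather than the CPU time reported above.

```sas
%macro time_sortn(n);
  data _null_;
    /* Temporary array: held in memory, creates no data set variables */
    array x[&n] _temporary_;

    /* Fill the array with pseudo-random values (fixed seed, an assumption) */
    call streaminit(12345);
    do i = 1 to &n;
      x[i] = rand('uniform');
    end;

    /* Time only the sort itself, as elapsed wall-clock seconds */
    t0 = datetime();
    call sortn(of x[*]);
    elapsed = datetime() - t0;

    put "N=&n " elapsed= 8.3;
  run;
%mend time_sortn;

/* Example calls with increasing sizes */
%time_sortn(1000000)
%time_sortn(2000000)
%time_sortn(4000000)
```

The array size has to be known at compile time, which is why the size is passed as a macro parameter. The logged pairs of N and elapsed time can then be collected and fitted with whatever model is preferred.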

[Figure: CPU time in seconds for CALL SORTN plotted against array size in millions of elements]

 

Tom
Super User

Transpose the data and use PROC SORT.
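One possible reading of this suggestion, sketched with made-up names (`long`, `sorted`, `value`, and a five-element array `x`), is to write each array element out as its own observation and let PROC SORT order the rows:

```sas
data long;
  /* Hypothetical five-element array standing in for the real data */
  array x[5] x1-x5 (5 3 9 1 7);

  /* "Transpose" by hand: one output row per array element */
  do i = 1 to dim(x);
    value = x[i];
    output;
  end;
  keep value;
run;

/* Sort the rows by value, ascending */
proc sort data=long out=sorted;
  by value;
run;
```

PROC TRANSPOSE could do the reshaping instead when the values already sit in data set variables. Either way the sorted result ends up as rows in a data set rather than as an array inside the same DATA step.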

mkeintz
PROC Star

Like @Cynthia_sas, I would use the CALL SORTN subroutine.  It works in a single data step, and sorting 50,000 numeric variables shouldn't add very much demand for memory.

 

And even if the quicksort algorithm were faster, you would need to include a lot more code (probably as a macro) than the single statement call sortn.  Given that your objective is to generate code that "can be published and is very general", you might be better off using the simplest SAS code possible.
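To give a sense of the difference in code volume, here is a rough sketch of a hand-written quicksort in a DATA step. It is not taken from the Dorfman paper; it assumes a plain Lomuto partition with an explicit stack instead of recursion, and the array contents, names, and stack size are chosen only for this small illustration.

```sas
data _null_;
  /* Hypothetical eight-element array to sort */
  array x[8] _temporary_ (5 3 9 1 7 2 8 6);

  /* Explicit stack of pending subranges (sized for this tiny example) */
  array lo_stack[8] _temporary_;
  array hi_stack[8] _temporary_;

  top = 1;
  lo_stack[1] = 1;
  hi_stack[1] = dim(x);

  do while (top > 0);
    /* Pop the next subrange to partition */
    lo = lo_stack[top];
    hi = hi_stack[top];
    top = top - 1;

    /* Lomuto partition around the last element of the subrange */
    pivot = x[hi];
    p = lo;
    do j = lo to hi - 1;
      if x[j] < pivot then do;
        tmp = x[p]; x[p] = x[j]; x[j] = tmp;
        p = p + 1;
      end;
    end;
    tmp = x[p]; x[p] = x[hi]; x[hi] = tmp;

    /* Push only subranges that still hold two or more elements */
    if lo < p - 1 then do;
      top = top + 1; lo_stack[top] = lo; hi_stack[top] = p - 1;
    end;
    if p + 1 < hi then do;
      top = top + 1; lo_stack[top] = p + 1; hi_stack[top] = hi;
    end;
  end;

  /* Write the sorted values to the log */
  do k = 1 to dim(x);
    value = x[k];
    put k= value=;
  end;
run;
```

Even this bare version, with no pivot selection strategy and no protection against the quadratic worst case on already sorted input, runs to dozens of statements where CALL SORTN needs one.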

 

 

 
