BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
molla
Fluorite | Level 6

Hi,

How can we sort a data set with out using proc sort ..

 

for example:

 

id   name

2     aaa

1     bbb

5     ccc

4     ddd

3     eee

 

output:

 

1   bbb

2   aaa

3   eee

4   ddd

5    ccc

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

The following takes six times longer than using proc sort, but includes some interesting concepts well worth while learning. Any file, array, or set of arrays can be sorted in either ascending or descending order. @SAShole's macro can do either. It wasn't designed to sort files but, with a slight modification, can easily do that as well. I use it when I need to sort an array but, simultaneously, change the order of a 2nd, 3rd or whatever array. Unfortunately, call sortc and call sortn can't do that!:

 

data have (drop=i);
  input id $ name $ other_stuff $;
  datalines;
b2     aaa xxx
b1     bbb yyy
b5     ccc zzz
b4     ddd aaa
b3     eee bbb
;
run;

/*==================================================================+
 | Name:    Qsort.                                                   |
 |===================================================================|
 | Developer: Paul M. Dorfman.                                       |
 |===================================================================|
 | Function: Sort array in place via the QuickSort algorithm.        |
 |===================================================================|
 | General:                                                          |
 |-------------------------------------------------------------------|
 | Qsort is a macro call routine designed to be invoked from within  |
 | a DATA step to sort a single-subscripted array or a parallel list |
 | of N arrays using one array as a key. The array(s) can be arrays  |
 | SAS variables or a temporary array; numeric or character. Arrays  |
 | in the list do not have to be the same type and/or item length.   |
 |                                                                   |
 | The kernel of the routine is a DATA step implementation of the    |
 | QuickSort algorithm (C.A.R. Hoare, 1960). Final ordering is done  |
 | by a single sweep of the modified insertion sort after quicksort  |
 | has reduced all array subpartitions to no more than &M items.     |
 |-------------------------------------------------------------------|
 | Performance:                                                      |
 |-------------------------------------------------------------------|
 | Run-time increases as N*log2(N), where N is the number of array   |
 | elements to be sorted. As a benchmark, a random numeric temporary |
 | array is sorted into ascending order in lt 1 CPU sec under OS/390 |
 | IBM model 9672 R36 running SAS version 8.0.                       |
 |===================================================================|
 | Usage examples:                                                   |
 |===================================================================|
 | 1. Sorting entire array A ascending (semicolon is optional):      |
 |                                                                   |
 |    %Qsort (Arr=A, Seq=a);                                         |
 |                                                                   |
 |    or simply, using Seq=a default,                                |
 |                                                                   |
 |    %Qsort (Arr=A);                                                |
 |-------------------------------------------------------------------|
 | 2. Sorting entire array B descending:                             |
 |                                                                   |
 |    %Qsort (Arr=B, Seq=d);                                         |
 |-------------------------------------------------------------------|
 | 3. Sorting elements -12345 through 12345 of array Z ascending,    |
 |    leaving the rest of the items intact;                          |
 |                                                                   |
 |    %Qsort (Arr=Z, first=-12345, last=12345);                      |
 |-------------------------------------------------------------------|
 | 4. Sorting first 100 elements of array C descending, the rest -   |
 |    ascending:                                                     |
 |                                                                   |
 |    %Qsort (Arr=C, Seq=d, last= 100);                              |
 |    %Qsort (Arr=C, Seq=a, first=101);                              |
 |-------------------------------------------------------------------|
 | 5. Sorting first half of array D ascending, second half -         |
 |    descending:                                                    |
 |                                                                   |
 |    half = int((lbound(D)+hbound(D))*.5);                          |
 |    %Qsort (Arr=D, Seq=D, last=half   );                           |
 |    %Qsort (Arr=D, Seq=A, first=half+1);                           |
 |                                                                   |
 |    or without HALF and omitting Seq=A by default:                 |
 |                                                                   |
 |    %Qsort (Arr=D, Seq=D, last= (lbound(D)+hbound(D))*.5  );       |
 |    %Qsort (Arr=D,        first=(lbound(D)+hbound(D))*.5+1);       |
 |-------------------------------------------------------------------|
 | 6. Doing the same as in 5 without the auxiliary variable HALF     |
 |    and omitting Seq=A by default:                                 |
 |                                                                   |
 |    %Qsort (Arr=D, Seq=D, last= (lbound(D)+hbound(D))*.5  );       |
 |    %Qsort (Arr=D,        first=(lbound(D)+hbound(D))*.5+1);       |
 |-------------------------------------------------------------------|
 | 7. Sorting parallel arrays A B C using array B as the key array:  |
 |                                                                   |
 |    %Qsort (Arr=A B C, By=C) ;                                     |
 |                                                                   |
 |    Illustration. The following step uses array C as a key:        |
 |    ------------                                                   |
 |    data _null_;                                                   |
 |       array a(10)    ( 9   8   7   6   5   4   3   2   1   0 );   |
 |       array b(10) $  ('a' 'b' 'c' 'd' 'e' 'f' 'g' 'h' 'i' 'j');   |
 |       array c(10)    ( 0   1   2   3   4   5   6   7   8   9 );   |
 |      %Qsort (Arr=A B C, By=C, Seq=D);                             |
 |       put A(*) / B(*) / C(*);                                     |
 |    run;                                                           |
 |                                                                   |
 |    It prints the following in the log:                            |
 |                                                                   |
 |    0 1 2 3 4 5 6 7 8 9                                            |
 |    j i h g f e d c b a                                            |
 |    9 8 7 6 5 4 3 2 1 0                                            |
 |                                                                   |
 |    Note: QuickSort is an unstable algorithm, i.e. if the key array|
 |    ----------------------------------------                       |
 |    has ties (duplicate keys), subordinate array elements DO NOT   |
 |                               ---------------------------------   |
 |    retain their original relative order. This behavior is similar |
 |    ------------------------------------                           |
 |    to that of Proc Sort with NOEQUALS option.                     |
 |    The parallel array feature is most useful when it is necessary |
 |    to perform and indirect sort, i.e. with a single parallel array|
 |    containing pointers to long 'records'. For instance, instead of|
 |    giving Qsort the labor of ordering 20 parallel arrays, it is   |
 |    more efficient to store the pointers of the arrays in a single |
 |    parallel array, and then finish the entire thing using a sweep |
 |    of indirect sorting.                                           |
 |                                                                   |
 |    Note: When using LB= and HB= options with parallel arrays, ALL |
 |    ----------------------------------------                       |
 |    lower array bounds must be GE LB=value. Likewise, the upper    |
 |    bounds must all be LE than HB= value. As long as the domain of |
 |    ALL indices lies within an LB= and HB, the indexing of arrays  |
 |    on the list can be arbitrary (in particular, negative).        |
 |===================================================================|
 | Arguments:                                                        |
 |-------------------------------------------------------------------|
 | Parameter  Usage     Description                                  |
 |-------------------------------------------------------------------|
 | Arr=       Required  The name of the array (or list of any number |
 |                      of parallel arrays) to be sorted. By default,|
 |                      the first array in the list becomes the key  |
 |                      array, and the rest of the arrays are permut-|
 |                      ed accordingly.                              |
 |-------------------------------------------------------------------|
 | By=        Optional  The name of the key array in the list. If the|
 |                      list consists of a single array, it is the   |
 |                      key array itself - along with the rule above.|
 |                      See usage for more details.                  |
 |-------------------------------------------------------------------|
 | Seq=       Optional  Sorting sequence. Ascending is default. To   |
 |                      specify it explicitly, set Seq=A (any case). |
 |                      Anything else will result in the array sorted|
 |                      decsending.                                  |
 |-------------------------------------------------------------------|
 | LB=        Optional  The indices of the first and last array      |
 | HB=        Numeric   elements to be included into sorting. You    |
 |                      can specify any valid numeric SAS expression,|
 |                      hardcoded numeric value, or a macrovariable  |
 |                      reference resolving to any of the above. The |
 |                      values of these parms default to the lower   |
 |                      and upper bounds.                            |
 |                      Use LB= and HB= parameters if you need only  |
 |                      part of the array to be ordered, or if you   |
 |                      want different parts ordered differently,    |
 |                      which can be  achieved by issuing two or more|
 |                      consecutive calls to Qsort with LB= and HB=  |
 |                      specified accordingly (see Usage above).     |
 |-------------------------------------------------------------------|
 | M=         Optional  Tuning parm: The largest subpartition size   |
 |            Numeric   Quicksort attempts to partition further. Any |
 |                      subpartition LE &M is passed to the straight |
 |                      insertion sort. &M=1 corresponds to 'pure'   |
 |                      Quicksort working until all subpartitions    |
 |                      have been reduced to just 1 element. &M=9 is |
 |                      optimal. Variations from 5 to 15 affect the  |
 |                      sorting speed very slightly. Best advice as  |
 |                      to M= : Leave it alone at 9.                 |
 +===================================================================*/
 %Macro Qsort (
Arr = /* Parallel array name list */
,By = %QScan(&Arr,1,%Str( )) /* Key array name */
,Seq = A /* Seq=D for descending */
,LB = Lbound(&By) /* Lower bound to sort */
,HB = Hbound(&By) /* Upper bound to sort */
,M = 9 /* Tuning range: (1:15) */
);
%Local _ H I J L N P Q S T W ; %Macro Swap (I,J) ; %Local W ; Do ; %Do W = 1 %To &N ; &T&W = &&A&W(&I) ; &&A&W(&I) = &&A&W(&J) ; &&A&W(&J) = &T&W ; %End ; End ; %Mend Swap ; %If %Upcase(&Seq) = %Upcase(A) %Then %Let Q = G ; %Else %Let Q = L ; %Do %Until (&&A&N EQ ) ; %Let N = %Eval(&N + 1) ; %Local A&N ; %Let A&N = %Scan(&Arr,&N,%Str( )) ; %End ; %Let N = %Eval(&N - 1) ; %Let _ = %Substr(%Sysfunc(Ranuni(0)),3, %Eval(7 - %Length(&N) + 5*(%Substr(&Sysver,1,1) GT 6))) ; %Let H = H&_ ; %Let I = I&_ ; %Let J = J&_ ; %Let L = L&_ ; %Let P = P&_ ; %Let S = S&_ ; %Let T = T&_ ; %Let Z = Z&_ ; Array &Z (0:1, 0:50) _Temporary_ ; &L = &LB ; &H = &HB ; If &H - &L GT &M Then Do &S=1 By 0 While (&S) ; &J = (&H - &L)/3 ; &I = &L + &J ; &J = &I + &J ; If &By(&L) &Q.T &By(&I) Then %Swap(&L,&I) ; If &By(&I) &Q.T &By(&J) Then %Swap(&I,&J) ; If &By(&J) &Q.T &By(&H) Then %Swap(&J,&H) ; If &By(&L) &Q.T &By(&I) Then %Swap(&L,&I) ; If &By(&I) &Q.T &By(&J) Then %Swap(&I,&J) ; If &By(&L) &Q.T &By(&I) Then %Swap(&L,&I) ; %If &M LE 3 %Then %Do ; If &H - &L LE 3 Then Do ; &L = &Z(0,&S) ; &H = &Z(1,&S) ; &S +- 1 ; Continue ; End ; %End ; %Swap(&L,&I) ; &P = &By(&L) ; &I = &L ; Do &J=&H + 1 By 0 ; Do &I=&I + 1 By + 1 Until(&By(&I) &Q.E &P) ; End ; Do &J=&J - 1 By - 1 Until(&P &Q.E &By(&J)) ; End ; If &I GE &J Then Leave ; %Swap(&I,&J) ; End ; %Swap(&L,&J) ; If &H - &J GE &J - &L GT &M Then Do &S = &S + 1 ; &Z(0,&S) = &J + 1 ; &Z(1,&S) = &H ; &H = &J - 1 ; End ; Else If &J - &L GE &H - &J GT &M Then Do &S = &S + 1 ; &Z(0,&S) = &L ; &Z(1,&S) = &J - 1 ; &L = &J + 1 ; End ; Else If &J - &L GT &M GE &H - &J Then &H = &J - 1 ; Else If &H - &J GT &M GE &J - &L Then &L = &J + 1 ; Else Do ; &L = &Z(0,&S) ; &H = &Z(1,&S) ; &S +- 1 ; End ; End ; %If &M = 1 %Then %Goto Exit ; Do &J = &LB + 1 To &HB ; If &By(&J - 1) &Q.T &By(&J) Then Do ; &P = &By(&J) ; %Do W = 1 %To &N ; %If &&A&W Ne &By %Then &T&W = &&A&W(&J) ; ; %End ; Do &I = &J - 1 To &LB By - 1 ; If &P &Q.E &By(&I) Then Leave ; %Do W = 1 %To &N ; &&A&W(&I + 1) = &&A&W(&I) ; %End ; End ; &By(&I + 1) = &P ; %Do W = 1 %To &N ; %If &&A&W Ne &By %Then &&A&W(&I + 1) = &T&W ; ; %End ; End ; End ; %Exit: Drop &H &I &J &L &P &S T&_: ; %Mend Qsort ; /* data have; input v1-v11; cards4; 0.004159 0.033589 0 0.277991 0.277991 0.024312 0 0.004479 0 0.050864 0.002239 0.047345 0 0.098209 0.119962 0.164107 0.004159 0 0.004159 0.002239 0.004798 0.028791 0 0.071337 0.12476 0.040627 0.089891 0.016955 0 0.042226 0 0 0.022713 0.016635 0.147473 0.06238 0.051184 0.022393 0.206654 0.002559 0.004479 0.038708 0.121561 0 0.277351 0 0.158989 0 0 0 0.056942 0.21849 0 0.011836 0.005758 0.038708 0 0.119642 0 0.012476 0.039987 0.004479 0.00064 0.150992 0.117722 0.050544 0.00032 0.050224 0 0 0 0.117083 0.135956 0.130838 0.241523 0.03199 0.022073 0.00128 0.018874 0.029111 0.023992 0.088612 0 0.105886 0.045425 0.109085 0 0 0.115483 0 0.024312 0.004159 0 0.089571 0.021433 0 0.024952 0.18682 0.026871 0.003519 0 0 0.00096 0.34293 0.129559 0 0 0 0 0.003839 ;;;; run; data want (keep=v1-v4 name1-name4); set have; array A(11) v1-v11; array N(11) $ name1-name11; do i=1 to dim(A); N(i)=vname(A(i)); end; %Qsort (Arr=A N, By=A, Seq=D); run; */ data _null_; call symput('nobs',nobs); stop; set have nobs=nobs; run; data want (drop=i); array point_var(&nobs.) _temporary_; array id_var(&nobs.) $ _temporary_; i=0; do until (eof); set have end = eof nobs=n; i+1; id_var(i)=id; point_var(i)=i; end; %Qsort (Arr=point_var id_var, By=id_var, Seq=A); do i=1 to n; x=point_var(i); set have point=x; output; end; run;

Art, CEO, AnalystFinder.com

 

View solution in original post

14 REPLIES 14
Kurt_Bremser
Super User

Use proc sql, but technically it does the same.

Or you could use a hash object in a data step, but that is much more complicated than typing 3 lines of proc sort.

Proc sort is the best, performancewise. And it scales fine until you run out of WORK (or UTILLOC) space.

PeterClemmensen
Tourmaline | Level 20

I recommend proc sort, but if this is out of the question, you can use proc sql

 

data have;
input id name $;
datalines;
2 aaa
1 bbb
5 ccc
4 ddd
3 eee
;

proc sql;
   create table want as
   select * 
   from have
   order by id;
quit;
molla
Fluorite | Level 6

I need to sort the data in datastep without using procedures..

molla
Fluorite | Level 6
I need to sort without using procedures
PeterClemmensen
Tourmaline | Level 20

This will do the trick, though proc sort is probably still a more efficient and simpler choice

 

data _null_;
   if 0 then set have(keep=id name);
   declare hash sorthave (dataset:'have', ordered:'Y');
      sorthave.definekey ('id');
      sorthave.definedata ('id','name');
      sorthave.definedone ();
   sorthave.output(dataset:'want');
stop;
run;

  

ChrisBrooks
Ammonite | Level 13

Here's a paper showing how to do a bubble sort http://support.sas.com/resources/papers/proceedings14/2025-2014.pdf but I can't help but be curious as to why you can't use ANY Procs......

ballardw
Super User

I have to say that one of the things I liked about SAS when I first started was not having to write my sort routines from scratch as I had in FORTRAN. Someone would have to point to a very overriding concern to avoid proc sort. Or perhaps you are asking how to use an external (i.e. non-SAS) host sort utility program?

Kurt_Bremser
Super User

@molla wrote:
I need to sort without using procedures

This is one of the most stupid things I ever read on the internet. Have the person that gave you this requirement visit a doctor. And don't smoke the same stuff, it seems to be contaminated.

Patrick
Opal | Level 21

@molla "I need to sort without using procedures"

 

Why?

molla
Fluorite | Level 6

I was asked in a interview to sort in data step

Patrick
Opal | Level 21

Then I hope that your interviewer was after the hash solution @PeterClemmensen proposed or after some answer which just demonstrates your understanding of the data step as such.

Ksharp
Super User
Are you sure your boss want a data analysis guy NOT a computer guy ?



data have;
input id   name $;
cards;
2     aaa
1     bbb
5     ccc
4     ddd
3     eee
;
run;
data want;
 set have end=last;
 array x{9999} _temporary_;
 array y{9999} $ 40 _temporary_;
 x{_n_}=id;
 y{_n_}=name;
 if last then do;
  do i=1 to _n_-1;
   do j=i+1 to _n_;
    if x{i} gt x{j} then do;
      t1=x{i};
	  x{i}=x{j};
	  x{j}=t1;

	  t2=y{i};
	  y{i}=y{j};
	  y{j}=t2;
     end;
   end;
  end;

  do k=1 to _n_;
   id=x{k};
   name=y{k};
   output;
  end;
 end;
 keep id name;
run;
KachiM
Rhodochrosite | Level 12

Here is sortless Sorting using Key Indexing.

 

[1] If id is a number and that maximum if id is 5 then the following will work:

data have;
input id name $;
datalines;
2     aaa
1     bbb
5     ccc
4     ddd
3     eee
;
run;

data want;
   array k[5] $8 _temporary_;
   if _n_ = 1 then do until(eof);
      set have end = eof;
      k[id] = name;
   end;
   do i = 1 to dim(k);
      id = i;
      name = k[i];
      output;
   end;
stop;
drop i;
run;

proc print data = want;
run;

If ID is a digital string the following can be done while creating HAVE using:

data have(drop = id rename = (iid = id));
input id $ name $;
iid = input(id, 8.);
datalines;
2     aaa
1     bbb
5     ccc
4     ddd
3     eee
;
run;

If the data set, HAVE is very large, the minimum and maximum of ID can't be found, then use the following program to determine them

and the use the program that follows:

data _null_;
   min = constant('BIG');
   max = 0;
   do until(eof);
      set have end = eof;
      if id < min then min = id;
      if id > max then max = id;
   end;
   call symputx('min', min);
   call symputx('max', max);
stop;
run;

%put &min;
%put &max;

data want;
   array k[&min:&max] $8 _temporary_;
   if _n_ = 1 then do until(eof);
      set have end = eof;
      k[id] = name;
   end;
   do i = lbound(k) to hbound(k);
      id = i;
      name = k[i];
      output;
   end;
stop;
drop i;
run;

 

art297
Opal | Level 21

The following takes six times longer than using proc sort, but includes some interesting concepts well worth while learning. Any file, array, or set of arrays can be sorted in either ascending or descending order. @SAShole's macro can do either. It wasn't designed to sort files but, with a slight modification, can easily do that as well. I use it when I need to sort an array but, simultaneously, change the order of a 2nd, 3rd or whatever array. Unfortunately, call sortc and call sortn can't do that!:

 

data have (drop=i);
  input id $ name $ other_stuff $;
  datalines;
b2     aaa xxx
b1     bbb yyy
b5     ccc zzz
b4     ddd aaa
b3     eee bbb
;
run;

/*==================================================================+
 | Name:    Qsort.                                                   |
 |===================================================================|
 | Developer: Paul M. Dorfman.                                       |
 |===================================================================|
 | Function: Sort array in place via the QuickSort algorithm.        |
 |===================================================================|
 | General:                                                          |
 |-------------------------------------------------------------------|
 | Qsort is a macro call routine designed to be invoked from within  |
 | a DATA step to sort a single-subscripted array or a parallel list |
 | of N arrays using one array as a key. The array(s) can be arrays  |
 | SAS variables or a temporary array; numeric or character. Arrays  |
 | in the list do not have to be the same type and/or item length.   |
 |                                                                   |
 | The kernel of the routine is a DATA step implementation of the    |
 | QuickSort algorithm (C.A.R. Hoare, 1960). Final ordering is done  |
 | by a single sweep of the modified insertion sort after quicksort  |
 | has reduced all array subpartitions to no more than &M items.     |
 |-------------------------------------------------------------------|
 | Performance:                                                      |
 |-------------------------------------------------------------------|
 | Run-time increases as N*log2(N), where N is the number of array   |
 | elements to be sorted. As a benchmark, a random numeric temporary |
 | array is sorted into ascending order in lt 1 CPU sec under OS/390 |
 | IBM model 9672 R36 running SAS version 8.0.                       |
 |===================================================================|
 | Usage examples:                                                   |
 |===================================================================|
 | 1. Sorting entire array A ascending (semicolon is optional):      |
 |                                                                   |
 |    %Qsort (Arr=A, Seq=a);                                         |
 |                                                                   |
 |    or simply, using Seq=a default,                                |
 |                                                                   |
 |    %Qsort (Arr=A);                                                |
 |-------------------------------------------------------------------|
 | 2. Sorting entire array B descending:                             |
 |                                                                   |
 |    %Qsort (Arr=B, Seq=d);                                         |
 |-------------------------------------------------------------------|
 | 3. Sorting elements -12345 through 12345 of array Z ascending,    |
 |    leaving the rest of the items intact;                          |
 |                                                                   |
 |    %Qsort (Arr=Z, first=-12345, last=12345);                      |
 |-------------------------------------------------------------------|
 | 4. Sorting first 100 elements of array C descending, the rest -   |
 |    ascending:                                                     |
 |                                                                   |
 |    %Qsort (Arr=C, Seq=d, last= 100);                              |
 |    %Qsort (Arr=C, Seq=a, first=101);                              |
 |-------------------------------------------------------------------|
 | 5. Sorting first half of array D ascending, second half -         |
 |    descending:                                                    |
 |                                                                   |
 |    half = int((lbound(D)+hbound(D))*.5);                          |
 |    %Qsort (Arr=D, Seq=D, last=half   );                           |
 |    %Qsort (Arr=D, Seq=A, first=half+1);                           |
 |                                                                   |
 |    or without HALF and omitting Seq=A by default:                 |
 |                                                                   |
 |    %Qsort (Arr=D, Seq=D, last= (lbound(D)+hbound(D))*.5  );       |
 |    %Qsort (Arr=D,        first=(lbound(D)+hbound(D))*.5+1);       |
 |-------------------------------------------------------------------|
 | 6. Doing the same as in 5 without the auxiliary variable HALF     |
 |    and omitting Seq=A by default:                                 |
 |                                                                   |
 |    %Qsort (Arr=D, Seq=D, last= (lbound(D)+hbound(D))*.5  );       |
 |    %Qsort (Arr=D,        first=(lbound(D)+hbound(D))*.5+1);       |
 |-------------------------------------------------------------------|
 | 7. Sorting parallel arrays A B C using array B as the key array:  |
 |                                                                   |
 |    %Qsort (Arr=A B C, By=C) ;                                     |
 |                                                                   |
 |    Illustration. The following step uses array C as a key:        |
 |    ------------                                                   |
 |    data _null_;                                                   |
 |       array a(10)    ( 9   8   7   6   5   4   3   2   1   0 );   |
 |       array b(10) $  ('a' 'b' 'c' 'd' 'e' 'f' 'g' 'h' 'i' 'j');   |
 |       array c(10)    ( 0   1   2   3   4   5   6   7   8   9 );   |
 |      %Qsort (Arr=A B C, By=C, Seq=D);                             |
 |       put A(*) / B(*) / C(*);                                     |
 |    run;                                                           |
 |                                                                   |
 |    It prints the following in the log:                            |
 |                                                                   |
 |    0 1 2 3 4 5 6 7 8 9                                            |
 |    j i h g f e d c b a                                            |
 |    9 8 7 6 5 4 3 2 1 0                                            |
 |                                                                   |
 |    Note: QuickSort is an unstable algorithm, i.e. if the key array|
 |    ----------------------------------------                       |
 |    has ties (duplicate keys), subordinate array elements DO NOT   |
 |                               ---------------------------------   |
 |    retain their original relative order. This behavior is similar |
 |    ------------------------------------                           |
 |    to that of Proc Sort with NOEQUALS option.                     |
 |    The parallel array feature is most useful when it is necessary |
 |    to perform and indirect sort, i.e. with a single parallel array|
 |    containing pointers to long 'records'. For instance, instead of|
 |    giving Qsort the labor of ordering 20 parallel arrays, it is   |
 |    more efficient to store the pointers of the arrays in a single |
 |    parallel array, and then finish the entire thing using a sweep |
 |    of indirect sorting.                                           |
 |                                                                   |
 |    Note: When using LB= and HB= options with parallel arrays, ALL |
 |    ----------------------------------------                       |
 |    lower array bounds must be GE LB=value. Likewise, the upper    |
 |    bounds must all be LE than HB= value. As long as the domain of |
 |    ALL indices lies within an LB= and HB, the indexing of arrays  |
 |    on the list can be arbitrary (in particular, negative).        |
 |===================================================================|
 | Arguments:                                                        |
 |-------------------------------------------------------------------|
 | Parameter  Usage     Description                                  |
 |-------------------------------------------------------------------|
 | Arr=       Required  The name of the array (or list of any number |
 |                      of parallel arrays) to be sorted. By default,|
 |                      the first array in the list becomes the key  |
 |                      array, and the rest of the arrays are permut-|
 |                      ed accordingly.                              |
 |-------------------------------------------------------------------|
 | By=        Optional  The name of the key array in the list. If the|
 |                      list consists of a single array, it is the   |
 |                      key array itself - along with the rule above.|
 |                      See usage for more details.                  |
 |-------------------------------------------------------------------|
 | Seq=       Optional  Sorting sequence. Ascending is default. To   |
 |                      specify it explicitly, set Seq=A (any case). |
 |                      Anything else will result in the array sorted|
 |                      decsending.                                  |
 |-------------------------------------------------------------------|
 | LB=        Optional  The indices of the first and last array      |
 | HB=        Numeric   elements to be included into sorting. You    |
 |                      can specify any valid numeric SAS expression,|
 |                      hardcoded numeric value, or a macrovariable  |
 |                      reference resolving to any of the above. The |
 |                      values of these parms default to the lower   |
 |                      and upper bounds.                            |
 |                      Use LB= and HB= parameters if you need only  |
 |                      part of the array to be ordered, or if you   |
 |                      want different parts ordered differently,    |
 |                      which can be  achieved by issuing two or more|
 |                      consecutive calls to Qsort with LB= and HB=  |
 |                      specified accordingly (see Usage above).     |
 |-------------------------------------------------------------------|
 | M=         Optional  Tuning parm: The largest subpartition size   |
 |            Numeric   Quicksort attempts to partition further. Any |
 |                      subpartition LE &M is passed to the straight |
 |                      insertion sort. &M=1 corresponds to 'pure'   |
 |                      Quicksort working until all subpartitions    |
 |                      have been reduced to just 1 element. &M=9 is |
 |                      optimal. Variations from 5 to 15 affect the  |
 |                      sorting speed very slightly. Best advice as  |
 |                      to M= : Leave it alone at 9.                 |
 +===================================================================*/
 %Macro Qsort (
Arr = /* Parallel array name list */
,By = %QScan(&Arr,1,%Str( )) /* Key array name */
,Seq = A /* Seq=D for descending */
,LB = Lbound(&By) /* Lower bound to sort */
,HB = Hbound(&By) /* Upper bound to sort */
,M = 9 /* Tuning range: (1:15) */
);
%Local _ H I J L N P Q S T W ; %Macro Swap (I,J) ; %Local W ; Do ; %Do W = 1 %To &N ; &T&W = &&A&W(&I) ; &&A&W(&I) = &&A&W(&J) ; &&A&W(&J) = &T&W ; %End ; End ; %Mend Swap ; %If %Upcase(&Seq) = %Upcase(A) %Then %Let Q = G ; %Else %Let Q = L ; %Do %Until (&&A&N EQ ) ; %Let N = %Eval(&N + 1) ; %Local A&N ; %Let A&N = %Scan(&Arr,&N,%Str( )) ; %End ; %Let N = %Eval(&N - 1) ; %Let _ = %Substr(%Sysfunc(Ranuni(0)),3, %Eval(7 - %Length(&N) + 5*(%Substr(&Sysver,1,1) GT 6))) ; %Let H = H&_ ; %Let I = I&_ ; %Let J = J&_ ; %Let L = L&_ ; %Let P = P&_ ; %Let S = S&_ ; %Let T = T&_ ; %Let Z = Z&_ ; Array &Z (0:1, 0:50) _Temporary_ ; &L = &LB ; &H = &HB ; If &H - &L GT &M Then Do &S=1 By 0 While (&S) ; &J = (&H - &L)/3 ; &I = &L + &J ; &J = &I + &J ; If &By(&L) &Q.T &By(&I) Then %Swap(&L,&I) ; If &By(&I) &Q.T &By(&J) Then %Swap(&I,&J) ; If &By(&J) &Q.T &By(&H) Then %Swap(&J,&H) ; If &By(&L) &Q.T &By(&I) Then %Swap(&L,&I) ; If &By(&I) &Q.T &By(&J) Then %Swap(&I,&J) ; If &By(&L) &Q.T &By(&I) Then %Swap(&L,&I) ; %If &M LE 3 %Then %Do ; If &H - &L LE 3 Then Do ; &L = &Z(0,&S) ; &H = &Z(1,&S) ; &S +- 1 ; Continue ; End ; %End ; %Swap(&L,&I) ; &P = &By(&L) ; &I = &L ; Do &J=&H + 1 By 0 ; Do &I=&I + 1 By + 1 Until(&By(&I) &Q.E &P) ; End ; Do &J=&J - 1 By - 1 Until(&P &Q.E &By(&J)) ; End ; If &I GE &J Then Leave ; %Swap(&I,&J) ; End ; %Swap(&L,&J) ; If &H - &J GE &J - &L GT &M Then Do &S = &S + 1 ; &Z(0,&S) = &J + 1 ; &Z(1,&S) = &H ; &H = &J - 1 ; End ; Else If &J - &L GE &H - &J GT &M Then Do &S = &S + 1 ; &Z(0,&S) = &L ; &Z(1,&S) = &J - 1 ; &L = &J + 1 ; End ; Else If &J - &L GT &M GE &H - &J Then &H = &J - 1 ; Else If &H - &J GT &M GE &J - &L Then &L = &J + 1 ; Else Do ; &L = &Z(0,&S) ; &H = &Z(1,&S) ; &S +- 1 ; End ; End ; %If &M = 1 %Then %Goto Exit ; Do &J = &LB + 1 To &HB ; If &By(&J - 1) &Q.T &By(&J) Then Do ; &P = &By(&J) ; %Do W = 1 %To &N ; %If &&A&W Ne &By %Then &T&W = &&A&W(&J) ; ; %End ; Do &I = &J - 1 To &LB By - 1 ; If &P &Q.E &By(&I) Then Leave ; %Do W = 1 %To &N ; &&A&W(&I + 1) = &&A&W(&I) ; %End ; End ; &By(&I + 1) = &P ; %Do W = 1 %To &N ; %If &&A&W Ne &By %Then &&A&W(&I + 1) = &T&W ; ; %End ; End ; End ; %Exit: Drop &H &I &J &L &P &S T&_: ; %Mend Qsort ; /* data have; input v1-v11; cards4; 0.004159 0.033589 0 0.277991 0.277991 0.024312 0 0.004479 0 0.050864 0.002239 0.047345 0 0.098209 0.119962 0.164107 0.004159 0 0.004159 0.002239 0.004798 0.028791 0 0.071337 0.12476 0.040627 0.089891 0.016955 0 0.042226 0 0 0.022713 0.016635 0.147473 0.06238 0.051184 0.022393 0.206654 0.002559 0.004479 0.038708 0.121561 0 0.277351 0 0.158989 0 0 0 0.056942 0.21849 0 0.011836 0.005758 0.038708 0 0.119642 0 0.012476 0.039987 0.004479 0.00064 0.150992 0.117722 0.050544 0.00032 0.050224 0 0 0 0.117083 0.135956 0.130838 0.241523 0.03199 0.022073 0.00128 0.018874 0.029111 0.023992 0.088612 0 0.105886 0.045425 0.109085 0 0 0.115483 0 0.024312 0.004159 0 0.089571 0.021433 0 0.024952 0.18682 0.026871 0.003519 0 0 0.00096 0.34293 0.129559 0 0 0 0 0.003839 ;;;; run; data want (keep=v1-v4 name1-name4); set have; array A(11) v1-v11; array N(11) $ name1-name11; do i=1 to dim(A); N(i)=vname(A(i)); end; %Qsort (Arr=A N, By=A, Seq=D); run; */ data _null_; call symput('nobs',nobs); stop; set have nobs=nobs; run; data want (drop=i); array point_var(&nobs.) _temporary_; array id_var(&nobs.) $ _temporary_; i=0; do until (eof); set have end = eof nobs=n; i+1; id_var(i)=id; point_var(i)=i; end; %Qsort (Arr=point_var id_var, By=id_var, Seq=A); do i=1 to n; x=point_var(i); set have point=x; output; end; run;

Art, CEO, AnalystFinder.com

 

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 14 replies
  • 4668 views
  • 8 likes
  • 9 in conversation