BookmarkSubscribeRSS Feed
statadm
Fluorite | Level 6

I want to array a group of variables and output the 5 highest values into 5 new variables.  I know how to do this, but I don't know how to indicate which variable the highest values came from.  Does anyone know how this can be done?

Thanks in advance.

5 REPLIES 5
MikeZdeb
Rhodochrosite | Level 12

Hi ... not sure how you deal with TIES (multiple variables with same value within an observation), but if there are not any

duplicate large values within an observation, this should work ... 

* some data;

data x;

array x(20);

do j=1 to 10;

do k= 1 to 20;

   x(k) = ceil(10000*ranuni(999));

end;

output;

end;

keep x1-x20;

run;

* use WHICHN and VNAME to get the variable names ... put into variables z1-z5);

data y;

set x;

array x(20);

array y(5);

array z(5) $32;

do j = 1 to 5;

   y(j) = largest(j, of x:);

   var = whichn(y(j),of x:);

   z(j) = vname(x(var));

end;

keep y: z:;

run;

y1      y2      y3      y4      y5     z1     z2     z3     z4     z5

8968    7881    7777    7443    6810    x16    x19    x1     x9     x5

9990    9956    9568    9384    9257    x9     x3     x18    x4     x16

9297    8066    7714    7355    7072    x11    x9     x10    x18    x4

9969    9903    8598    8386    8184    x10    x7     x1     x20    x18

9947    9512    9398    8740    7714    x11    x17    x18    x1     x4

9609    9528    8581    7187    7028    x10    x4     x9     x5     x7

9550    9403    8798    8024    5612    x9     x6     x12    x17    x2

9727    9605    9173    8484    8360    x14    x18    x13    x3     x2

9279    9091    9010    8755    8178    x2     x10    x6     x14    x17

9898    8788    8753    8444    8188    x13    x20    x16    x19    x17

I guess if there were ties , you could resort to  transposing the data and using PROC RANK.  You could

end up with more than 5 largest for each observation (or do something more elaborate with the above ARRAY

based code).


MikeZdeb
Rhodochrosite | Level 12

Hi ... OK, upon a little more thought,  I think that this works with ties.  If there are multiple large values, this allows

up to 10 variables  (Y1-Y10) to hold the 5 largest values and variable names (Z1-Z10) ...

* some data ... restricted range of X1-X20 leads to some duplicate values within an observation;

data x;

array x(20);

do j=1 to 10;

do k= 1 to 20;

   x(k) = ceil(100*ranuni(999));

end;

output;

end;

keep x1-x20;

run;

data y;

set x;

array x(20);

array y(10);

array z(10) $32;

do j = 1 by 1 until(count eq 5 or j eq 10);

   y(j) = largest(j, of x:);

   var = whichn(y(j),of x:);

   z(j) = vname(x(var));

   if x(var) ne hold then do; count = sum(count,1); hold=x(var); x(var) = 1e6 ; end;

end;

keep y: z:;

run;



largest5.png
statadm
Fluorite | Level 6

Thank you so much, I will give this a try.

ballardw
Super User

Is there going to be any particular rule about choosing one variable over another in the case of tied values?

statadm
Fluorite | Level 6

There are no rules, but I know there will be no more than 5 values that can be the same.  I'm actually looking for the top 3 values, but extended it to 5 in order to pick up a 5 way tie.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 3974 views
  • 1 like
  • 3 in conversation