I want to array a group of variables and output the 5 highest values into 5 new variables. I know how to do this, but I don't know how to indicate which variable the highest values came from. Does anyone know how this can be done?
Thanks in advance.
Hi ... not sure how you deal with TIES (multiple variables with same value within an observation), but if there are not any
duplicate large values within an observation, this should work ...
* some data;
data x;
array x(20);
do j=1 to 10;
do k= 1 to 20;
x(k) = ceil(10000*ranuni(999));
end;
output;
end;
keep x1-x20;
run;
* use WHICHN and VNAME to get the variable names ... put into variables z1-z5);
data y;
set x;
array x(20);
array y(5);
array z(5) $32;
do j = 1 to 5;
y(j) = largest(j, of x:);
var = whichn(y(j),of x:);
z(j) = vname(x(var));
end;
keep y: z:;
run;
y1 y2 y3 y4 y5 z1 z2 z3 z4 z5
8968 7881 7777 7443 6810 x16 x19 x1 x9 x5
9990 9956 9568 9384 9257 x9 x3 x18 x4 x16
9297 8066 7714 7355 7072 x11 x9 x10 x18 x4
9969 9903 8598 8386 8184 x10 x7 x1 x20 x18
9947 9512 9398 8740 7714 x11 x17 x18 x1 x4
9609 9528 8581 7187 7028 x10 x4 x9 x5 x7
9550 9403 8798 8024 5612 x9 x6 x12 x17 x2
9727 9605 9173 8484 8360 x14 x18 x13 x3 x2
9279 9091 9010 8755 8178 x2 x10 x6 x14 x17
9898 8788 8753 8444 8188 x13 x20 x16 x19 x17
I guess if there were ties , you could resort to transposing the data and using PROC RANK. You could
end up with more than 5 largest for each observation (or do something more elaborate with the above ARRAY
based code).
Hi ... OK, upon a little more thought, I think that this works with ties. If there are multiple large values, this allows
up to 10 variables (Y1-Y10) to hold the 5 largest values and variable names (Z1-Z10) ...
* some data ... restricted range of X1-X20 leads to some duplicate values within an observation;
data x;
array x(20);
do j=1 to 10;
do k= 1 to 20;
x(k) = ceil(100*ranuni(999));
end;
output;
end;
keep x1-x20;
run;
data y;
set x;
array x(20);
array y(10);
array z(10) $32;
do j = 1 by 1 until(count eq 5 or j eq 10);
y(j) = largest(j, of x:);
var = whichn(y(j),of x:);
z(j) = vname(x(var));
if x(var) ne hold then do; count = sum(count,1); hold=x(var); x(var) = 1e6 ; end;
end;
keep y: z:;
run;
Thank you so much, I will give this a try.
Is there going to be any particular rule about choosing one variable over another in the case of tied values?
There are no rules, but I know there will be no more than 5 values that can be the same. I'm actually looking for the top 3 values, but extended it to 5 in order to pick up a 5 way tie.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.