I have a dataset looking like this:
data test;
array var(10) var1-var10;
cards;
1 2 3 4 5 4 3 2 1 1
. . . . . . . . . . .
;
run;
I want to find the maximum value in the array (in this case 5) and average the 3 values around it (the max value, max-1 and max+1 - in the example 4,5 and 4)
Thanks!
data want;
set test;
array v var1-var10;
maximum_value=max(of var1-var10);
loc=whichn(maximum_value,of var1-var10);
avg_of_3=mean(v(loc-1),v(loc),v(loc+1));
run;
Note: this code fails if the maximum of var1-var10 is in position 1 or position 10. If the maximum appears at two or more locations, it will find only the first such occurrence of the maximum and work with that.
data want;
set test;
array v var1-var10;
maximum_value=max(of var1-var10);
loc=whichn(maximum_value,of var1-var10);
avg_of_3=mean(v(loc-1),v(loc),v(loc+1));
run;
Note: this code fails if the maximum of var1-var10 is in position 1 or position 10. If the maximum appears at two or more locations, it will find only the first such occurrence of the maximum and work with that.
Thank you - works perfectly! Sorry about the code. Here is the complete code including yours:
data test;
input var1-var10;
cards;
1 2 3 4 5 4 3 2 1 1
1 3 4 5 6 3 5 2 2 1
;
run;
data want;
set test;
array v var1-var10;
maximum_value=max(of var1-var10);
loc=whichn(maximum_value,of var1-var10);
avg_of_3=mean(v(loc-1),v(loc),v(loc+1));
run;
You can fix the boundary condition by adding two extra variables (with missing values) to the ARRAY. You can use just one variable and add it twice to the array.
The result be the average of the TWO values at the boundary since the third value will be missing.
data want;
set test;
array v var0 var1-var10 var0;
maximum_value=max(of v[*]);
loc=whichn(maximum_value,of v[*]);
avg_of_3=mean(v(loc-1),v(loc),v(loc+1));
drop var0 ;
run;
@Tom wrote:
You can fix the boundary condition by adding two extra variables (with missing values) to the ARRAY. You can use just one variable and add it twice to the array.
The result be the average of the TWO values at the boundary since the third value will be missing.
True if that's what the OP wanted. However, no such statement from the OP has been made, so I didn't try to code anything for boundary conditions.
I did something similar (see code below) but your solution is more elegant. Thanks! I know I didn't ask for this, but just because I didn't think of that problem in the beginning. I appreciate all the help.
data want;
set test;
array v var1-var10;
maximum_value=max(of var1-var10);
loc=whichn(maximum_value,of var1-var10);
if loc ge 2 and loc le 9 then avg_of_3=mean(v(loc-1),v(loc),v(loc+1));
if loc=1 then avg_of_3=mean(v(loc),v(loc+1));
if loc=10 then avg_of_3=mean(v(loc),v(loc-1));
run;
Looks good to me! Good for you to make the fix by yourself.
Save $250 on SAS Innovate and get a free advance copy of the new SAS For Dummies book! Use the code "SASforDummies" to register. Don't miss out, May 6-9, in Orlando, Florida.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.