Hi SAS community,
I have the dataset below and need to calculate the mean of every possible combination of these 5 variables.
For example: Mean(Ques fres tas mas des), Mean(Ques fres tas mas), Mean(Ques fres tas), ...
Is there a way to get the means of all possible combinations of these 5 variables?
data a;
input Ques fres tas mas des ;
cards;
2 1 4 2 3
5 1 5 5 7
3 3 4 6 8
4 2 4 4 8
9 5 1 2 3
8 2 4 5 3
4 3 3 2 7
;
run;
Let me know if this is possible.
Thank you for your time and effort.
Look at the CALL ALLCOMB routine in the SAS functions documentation.
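One way this suggestion might look is sketched below. It is untested and rests on a few assumptions: it runs CALL ALLCOMB on an array of index numbers rather than on the data values, so that the first k entries of the array identify which variables are in the current combination. The names allcomb_means, idx1-idx5, combo, and mean are made up for illustration, and dataset a is the one from the question.

```sas
/* Sketch only: one output row per observation and combination */
data allcomb_means;
    set a;
    array x {*} Ques fres tas mas des;
    array idx {5} idx1-idx5;      /* hypothetical helper: positions 1..5 */
    length combo $160;
    n = dim(x);
    do k = 1 to n;                /* size of the combination */
        do h = 1 to n;            /* reset the positions before each size */
            idx{h} = h;
        end;
        do j = 1 to comb(n, k);
            /* after the call, idx1..idx<k> hold the j-th combination */
            call allcomb(j, k, of idx[*]);
            sum = 0; combo = " ";
            do h = 1 to k;
                sum = sum + x{idx{h}};
                combo = catx("-", combo, vname(x{idx{h}}));
            end;
            mean = sum / k;
            output;
        end;
    end;
    keep Ques fres tas mas des k combo mean;
run;
```

With 5 variables this should write 31 rows per input observation (5 + 10 + 10 + 5 + 1), each labelled with the variable names in its combination, so nothing has to be tracked by position.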
You do realize this requires adding 26 variables to your data set (one for each combination of two or more of the 5 variables)? Hint: the COMB function counts the combinations of R elements chosen from N choices. How are you going to keep track of which value belongs to which set of variables?
Before going into details, what is the purpose of this?
@shasank wrote:
Thank you for your reply. Yes, I do realize that. The purpose of this is to
take the mean of all the combinations possible with these 5 variables, and the
next step is to find the variable combination whose mean is closest to a
different variable in another dataset.
So what happens when you have the same mean for more than one of these combinations?
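To make the tie question concrete, here is a small sketch that keeps every combination tied for closest instead of picking one arbitrarily. The dataset names combo_means, scored, and closest and the target value 3.2 are hypothetical. The five means are computed by hand from the first observation of dataset a (Ques=2, fres=1, tas=4, mas=2).

```sas
/* Hypothetical long-format input: one row per combination */
data combo_means;
    input combo :$20. mean;
    cards;
Ques 2
fres 1
Ques-fres 1.5
tas-mas 3
Ques-tas 3
;
%let target = 3.2;   /* hypothetical target value */
data scored;
    set combo_means;
    diff = abs(mean - &target.);
run;
proc sort data=scored;
    by diff;
run;
data closest;
    set scored;
    retain best;
    if _n_ = 1 then best = diff;   /* smallest distance after sorting */
    if diff = best;                /* keep all rows tied for closest */
    drop best;
run;
```

Here both tas-mas and Ques-tas have a mean of 3, so both end up in closest. Whether to keep all ties or apply some tie-breaking rule is a decision the original question leaves open.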
If any of the variables is ever missing, then one of the choose-2 combinations will always have the same mean as one of the single variables. The more missing values there are per record, the more likely duplicate means become.
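A quick illustration of that point, with made-up variable names: the MEAN function skips missing arguments, so a pair with one missing member collapses to the mean of the other member. Plain arithmetic, such as the sum + x{i} accumulation used elsewhere in this thread, instead propagates the missing value.

```sas
data _null_;
    a = .; b = 4;
    m_fn    = mean(a, b);     /* MEAN ignores the missing value */
    m_arith = (a + b) / 2;    /* arithmetic on a missing value gives missing */
    put m_fn= m_arith=;       /* log shows: m_fn=4 m_arith=. */
run;
```

So which behaviour you get depends on how the mean is computed, and it is worth deciding up front how combinations containing missing values should be treated.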
Here is a simple way to get the means from all possible subsets (except the empty subset):
data a;
input Ques fres tas mas des;
cards;
2 1 4 2 3
5 1 5 5 7
3 3 4 6 8
4 2 4 4 8
9 5 1 2 3
8 2 4 5 3
4 3 3 2 7
;
/* Get the number of all possible subsets except the empty subset */
data _null_;
    set a;
    array x _numeric_;
    nbSets = 2 ** dim(x) - 1;
    call symputx("nbSets", nbSets);
    stop;
run;
data want;
    set a;
    array x _numeric_;
    array y {&nbSets.};
    /* The binary digits of s pick the subset: bit i set means x{i} is included */
    do s = 1 to &nbSets.;
        ss = s; sum = 0; nb = 0;
        do i = 1 by 1 until(ss = 0);
            if mod(ss, 2) = 1 then do;
                sum = sum + x{i};
                nb = nb + 1;
            end;
            ss = int(ss/2);
        end;
        y{s} = sum / nb;    /* mean of subset number s */
    end;
    drop s ss i sum nb;
run;
I wouldn't try to scale this method beyond n=12 variables: it adds 2**n - 1 new variables to the data set, which is already 4,095 at n=12.
Get the combos with:
data combos;
    set a;
    array x _numeric_;
    length combo $160;
    do set = 1 to &nbSets.;
        ss = set; combo = " ";
        do i = 1 by 1 until(ss = 0);
            if mod(ss, 2) = 1 then do;
                combo = catx("-", combo, vname(x{i}));
            end;
            ss = int(ss/2);
        end;
        output;
    end;
    stop;
    keep set combo;
run;
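To see which mean belongs to which combination, the y variables in want can be reshaped to one row per set and matched to combos by set. This is a rough, untested sketch that assumes the datasets want and combos and the macro variable nbSets from the steps above already exist. The names want_long, labelled, row, and mean are made up for illustration.

```sas
/* One row per observation and subset number */
data want_long;
    set want;
    row = _n_;                      /* remember the source observation */
    array y {*} y1-y&nbSets.;
    do set = 1 to dim(y);
        mean = y{set};
        output;
    end;
    keep row set mean;
run;
proc sort data=want_long;
    by set row;
run;
/* combos is already in set order, so a one-to-many merge attaches the labels */
data labelled;
    merge want_long combos;
    by set;
run;
```

The long layout also makes the next step easier: finding the combination whose mean is closest to some other value becomes a matter of sorting or filtering rows rather than scanning 31 columns.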