shasank
Quartz | Level 8

Hi SAS community, 

 

I have the dataset below and need to calculate the mean of every possible combination of these 5 variables. 

 

For example: Mean(Ques, fres, tas, mas, des), Mean(Ques, fres, tas, mas), Mean(Ques, fres, tas), ... 

Is there a way to get the mean of every possible combination of these 5 variables? 

 

data a;
input Ques fres tas mas des ;
cards;
2 1 4 2 3
5 1 5 5 7
3 3 4 6 8
4 2 4 4 8
9 5 1 2 3
8 2 4 5 3
4 3 3 2 7
;
run;

Let me know if this is possible. 

 

Thank you for your time and effort. 

 

 

ACCEPTED SOLUTION
PGStats
Opal | Level 21

Here is a simple way to get the means from all possible subsets (except the empty subset):

 

data a;
input Ques fres tas mas des;
cards;
2 1 4 2 3
5 1 5 5 7
3 3 4 6 8
4 2 4 4 8
9 5 1 2 3
8 2 4 5 3
4 3 3 2 7
;

/* Get the number of all possible subsets except the empty subset */
data _null_;
set a;
array x _numeric_;
nbSets = 2 ** dim(x) - 1; 
call symputx("nbSets", nbSets);
stop;
run;

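data want;
set a;
array x _numeric_;      /* the 5 input variables */
array y {&nbSets.};     /* one mean per non-empty subset */
do s = 1 to &nbSets.;
    /* Read s as a bitmask: if the i-th lowest bit is 1, x{i} is in the subset */
    ss = s; sum = 0; nb = 0;
    do i = 1 by 1 until(ss = 0);
        if mod(ss, 2) = 1 then do;
            sum = sum + x{i};
            nb = nb + 1;
            end;
        ss = int(ss/2);  /* move on to the next bit */
        end;
    y{s} = sum / nb;
    end;
drop s ss i sum nb;
run;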

I wouldn't try to scale this method beyond n=12.

PG
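
To put the n=12 caution in numbers, here is a small sketch that prints how many subset columns this method would create for a few values of n. The values 5, 12 and 20 are only illustrative choices, not figures from the thread.

```sas
/* Number of non-empty subsets of n variables is 2**n - 1 */
data _null_;
    do n = 5, 12, 20;
        nbSets = 2 ** n - 1;
        put n= nbSets=;
    end;
run;
```

With n=5 this gives the 31 columns used above; n=12 already means 4095 new columns per row.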


8 REPLIES
ballardw
Super User

Look at the CALL ALLCOMB routine in the SAS documentation.

 

You do realize this requires adding 26 variables to your data set? Hint: the COMB function counts combinations of R elements from N choices. How are you going to keep track of which value belongs to which set of variables?

 

Before going into details, what is the utility of this?
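
As a rough sketch of the hint above: COMB can confirm the count of 26 (combinations of 2 or more of the 5 variables), and CALL ALLCOMB can list the members of each combination. The character array v holding the variable names is an assumption made for illustration, and only the 3-variable combinations are listed to keep the log short.

```sas
data _null_;
    /* Hypothetical helper array holding the five variable names */
    array v[5] $4 ('Ques' 'fres' 'tas' 'mas' 'des');
    n = dim(v);

    /* Count the combinations of size 2 through 5 */
    total = 0;
    do k = 2 to n;
        ncomb = comb(n, k);
        total = total + ncomb;
        put k= ncomb=;
    end;
    put total=;

    /* List every 3-variable combination; after each call,  */
    /* the first k elements of v hold the current combination */
    k = 3;
    do j = 1 to comb(n, k);
        call allcomb(j, k, of v[*]);
        put j 3. +2 v1-v3;
    end;
run;
```

Note that CALL ALLCOMB rearranges the values of the variables passed to it, so it is safer to run it on a copy (or on names, as here) than on the measurements themselves.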

shasank
Quartz | Level 8
Thank you for your reply. Yes, I do realize that. The purpose of this is to
take the mean of every possible combination of these 5 variables, and the
next step is to find the combination whose mean is closest to a
different variable in another dataset.
ballardw
Super User

@shasank wrote:
Thank you for your reply. Yes, I do realize that. The purpose of this is to
take the mean of every possible combination of these 5 variables, and the
next step is to find the combination whose mean is closest to a
different variable in another dataset.

So what happens when you have the same mean for more than one of these combinations?

If any of the variables are ever missing, then one of the two-variable means will always equal one of the single-variable values. More missing values per record make identical means even more likely.
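
A minimal sketch of that point, assuming the means are computed with the MEAN function, which ignores missing arguments. The variable names a and b are made up for the example.

```sas
data _null_;
    a = 4;
    b = .;                     /* missing value */
    pair_mean = mean(a, b);    /* MEAN skips the missing b */
    single_value = a;
    put pair_mean= single_value=;
run;
```

Both values come out as 4, so the pair cannot be told apart from the single variable. A plain sum divided by a count would instead return a missing value for any subset that contains a missing variable.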

shasank
Quartz | Level 8
That's a great point. Thank you for the critical thinking. My actual data is in a format of 11.11111 and doesn't have any missing values. The reason for this request is to find which combination of variables a piece of equipment used to arrive at its final number. The machine always uses a set of variables, and I am trying to find which combination is used most often.
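
One possible way to approach that second step, sketched on made-up data: for each row, pick the subset mean closest to a target value, then tabulate which subset wins most often. The data set means, its three y columns and the variable target are all hypothetical stand-ins for the real subset means and the equipment's final number.

```sas
/* Hypothetical input: three candidate means per row plus a target */
data means;
    input y1 y2 y3 target;
    cards;
2.0 1.0 1.5 1.4
5.0 1.0 3.0 4.8
;

data closest;
    set means;
    array y {*} y1-y3;
    best_set = .; best_diff = .;
    do s = 1 to dim(y);
        diff = abs(y{s} - target);
        /* On ties, the first subset found is kept */
        if best_diff = . or diff < best_diff then do;
            best_diff = diff;
            best_set = s;
        end;
    end;
    drop s diff;
run;

/* How often each subset is the closest match */
proc freq data=closest;
    tables best_set;
run;
```

The tie rule here (first match wins) is an arbitrary choice; with real data it may be worth flagging rows where several subsets are equally close.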

shasank
Quartz | Level 8
Thank you so much for the solution. I am trying to modify this code so that each mean's variable name shows the combination it was calculated from. Is there a way to modify the code to identify the combination of variables behind each mean?
PGStats
Opal | Level 21

Get the combos with:

 

data combos;
set a;
array x _numeric_;
length combo $160;
do set = 1 to &nbSets.;
    ss = set; combo = " ";
    do i = 1 by 1 until(ss = 0);
        if mod(ss, 2) = 1 then do;
            combo = catx("-", combo, vname(x{i}));
            end;
        ss = int(ss/2);
        end;
    output;
    end;
stop;
keep set combo;
run;
PG
shasank
Quartz | Level 8
Sorry for not being clear.
y{s} = sum / nb;
Instead of y1, y2, y3, is it possible to use the combo as the variable name? That would show which variables were combined to produce the value in each column.
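
One way this might be done, as a sketch only: build a rename list from the combos data set and apply it to want with PROC DATASETS. It assumes the want and combos data sets from the posts above exist in WORK and that y{s} and set use the same numbering, which they do in the code as posted. The m_ prefix and the underscore separator are assumptions added here, because a variable name cannot contain "-" and a bare single-variable combo such as Ques would clash with the original column.

```sas
/* Build a list such as: y1=m_Ques y2=m_fres y3=m_Ques_fres ... */
data _null_;
    set combos end=last;
    length renames $2000;
    retain renames;
    renames = catx(" ", renames,
                   cats("y", set, "=m_", translate(combo, "_", "-")));
    if last then call symputx("renames", renames);
run;

/* Rename the columns in place without rewriting the data */
proc datasets library=work nolist;
    modify want;
    rename &renames.;
quit;
```

With five short names the longest result, m_Ques_fres_tas_mas_des, stays within the 32-character limit for SAS variable names; longer names or more variables would need a different naming scheme.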
