DATA Step, Macro, Functions and more

neat way to sum all Combinations

Accepted Solution Solved
Reply
Contributor
Posts: 31
Accepted Solution

neat way to sum all Combinations

Hello, 

 

If I have data with variable x1, x2, x3, and I need to find all the possible combination of 3 (there are 7 ).  Is there any elegant way to do this instead of list all the combination?  Because I might end up with 5 or 10 variables.

 

data test;

  input x1 x2 x3;

datalines;

1 2 3

3 4 5

;

run;

 

data all;

  set test;

 

/***is there any neat way to do this combination, instead of hard coded all the combinations?***/

 comb1 = x1;

comb2 = x2;

comb3 = x3;

comb12 = sum(x1, x2);

comb13 = sum(x1, x3);

comb23 = sum(x2, x3);

comb123 = sum(x1, x2, x3);

run;

 

thank you!


Accepted Solutions
Solution
‎06-17-2016 02:25 PM
SAS Super FREQ
Posts: 3,476

Re: neat way to sum all Combinations

[ Edited ]

Yes, you can do this, probably in several ways. The following generates all binary sequences between 1 and 2**3-1 = 7:

001

010

011

,,,

111

The sum that you want is coef[1]*x1 + coef[2]*x2 + coef[3]*x3, where coef[i] is the i_th element of the binary sequence (thought of as a number). You can use the SUBSTR function is used to extract the i_th coefficient.

 

 

data all;
set test;
array x[3] x1-x3;
array comb[7] c1-c7;
label c1="comb3" c2="comb2" c3="comb23"
      c4="comb1" c5="comb13" c6="comb12" c7="comb123";
n = dim(x);
do i = 1 to 2**n - 1;         /* count 1 to 2**n - 1 */
   s = put(i, Binary3.);      /* generate binary representation */
   sum = 0;
   do k = 1 to n;             /* sum = SUM of s[k]*x[k] */
      coef = substr(s,k,1);
      sum + coef*x[k];
   end;
   comb[i] = sum;             /* store sum for this combination of variables */
end;   
output;
keep x1-x3 c1-c7;
run;
proc print label; 
var x1-x3 c4 c2 c1 c6 c5 c3 c7;
run;

 

Notice that my algorithm does not generate the values in the same order as you did. (You computed sums of variables taken one at a time, taken two at a time, and taken three at a time.) If necessary, I'm sure others on the list can modify my idea to give the results in a different order.  To help compare my approach with your, I assigned labels to each variable that agree with your variable names.

 

The algorithm is explained further in the article "Creating a matrix with all combinations of zeros and ones."

View solution in original post


All Replies
Community Manager
Posts: 2,761

Re: neat way to sum all Combinations

Have you looked at the ALLCOMB function?  @Rick_SAS also has a nice series of articles about generating combinations in SAS.

Solution
‎06-17-2016 02:25 PM
SAS Super FREQ
Posts: 3,476

Re: neat way to sum all Combinations

[ Edited ]

Yes, you can do this, probably in several ways. The following generates all binary sequences between 1 and 2**3-1 = 7:

001

010

011

,,,

111

The sum that you want is coef[1]*x1 + coef[2]*x2 + coef[3]*x3, where coef[i] is the i_th element of the binary sequence (thought of as a number). You can use the SUBSTR function is used to extract the i_th coefficient.

 

 

data all;
set test;
array x[3] x1-x3;
array comb[7] c1-c7;
label c1="comb3" c2="comb2" c3="comb23"
      c4="comb1" c5="comb13" c6="comb12" c7="comb123";
n = dim(x);
do i = 1 to 2**n - 1;         /* count 1 to 2**n - 1 */
   s = put(i, Binary3.);      /* generate binary representation */
   sum = 0;
   do k = 1 to n;             /* sum = SUM of s[k]*x[k] */
      coef = substr(s,k,1);
      sum + coef*x[k];
   end;
   comb[i] = sum;             /* store sum for this combination of variables */
end;   
output;
keep x1-x3 c1-c7;
run;
proc print label; 
var x1-x3 c4 c2 c1 c6 c5 c3 c7;
run;

 

Notice that my algorithm does not generate the values in the same order as you did. (You computed sums of variables taken one at a time, taken two at a time, and taken three at a time.) If necessary, I'm sure others on the list can modify my idea to give the results in a different order.  To help compare my approach with your, I assigned labels to each variable that agree with your variable names.

 

The algorithm is explained further in the article "Creating a matrix with all combinations of zeros and ones."

Contributor
Posts: 31

Re: neat way to sum all Combinations

Thank you, RicK! 

Occasional Contributor jo1
Occasional Contributor
Posts: 14

Re: neat way to sum all Combinations

I know this post was some time ago, but I found in useful in working out how to do a similar thing but with more variables.  The thing I'd like to be able to do but I can't think how is to know which of the combinations were 1, 2, 3 etc factor combinations - any suggestions of where to start looking for an answer.

 

Jo

SAS Super FREQ
Posts: 3,476

Re: neat way to sum all Combinations

I assume you figured out how to adjust the dimensions of the arrays and the width of the BINARYw. format., The important change is to add S to the KEEP statement. The variable S is a binary string that indicates which variables are in the combination. Thus '10011' represents x1, x2, and x5.

Occasional Contributor jo1
Occasional Contributor
Posts: 14

Re: neat way to sum all Combinations

Thanks yes, after a bit I worked out to increase the dimensions of the arrays and adjusted the width of the BINARY. I did try adding s to the KEEP statement but I'm assuming it would have been just the last combination that it was showing.
But continuing that line of thought I created a new table with the number of combinations and their BINARYw number (only the 1's)
data new;
do i = 1 to 29;
s = compress(put(i,Binary5.),"0");
output;
I needed to transpose the data anyway so the Combinations (columns) become a variable and then I was able to join the two tables together so instead of having Col1 - Col29 I returned the binary tag created in the new table 1, 11, 111 etc which identified them.
It is a bit of mucking around but I couldn't think of another way to do it. I'm not sure if I should have been able to label the column headings with the new table results somehow.
Occasional Contributor jo1
Occasional Contributor
Posts: 14

Re: neat way to sum all Combinations

I have been able to adjust this code to use when I have larger values of n but this gets out of hand quickly.  Can you give a clue on how to alter this code to only look at combinations of a set number of values.  Eg I might have 5 columns but I don't want to look at all possible 31 combinations but only the combinations that have 3 sample points in each combination = 10 combinations.

 

Thanks

SAS Super FREQ
Posts: 3,476

Re: neat way to sum all Combinations

The easiest way is to use the ALLCOMB function, as @ChrisHemedinger suggested. See the references in his post. If you want to stay with the binary approach, restrict your attention to strings that have exactly three 1s.  If you need further help, I suggest you open a new thread, explain your problem, and post the code that you are using.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 351 views
  • 1 like
  • 4 in conversation