turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- Base SAS Programming
- /
- neat way to sum all Combinations

Topic Options

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

06-17-2016 11:52 AM

Hello,

If I have data with variable x1, x2, x3, and I need to find all the possible combination of 3 (there are 7 ). Is there any elegant way to do this instead of list all the combination? Because I might end up with 5 or 10 variables.

data test;

input x1 x2 x3;

datalines;

1 2 3

3 4 5

;

run;

data all;

set test;

/***is there any neat way to do this combination, instead of hard coded all the combinations?***/

comb1 = x1;

comb2 = x2;

comb3 = x3;

comb12 = sum(x1, x2);

comb13 = sum(x1, x3);

comb23 = sum(x2, x3);

comb123 = sum(x1, x2, x3);

run;

thank you!

Accepted Solutions

Solution

06-17-2016
02:25 PM

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Ying

06-17-2016 01:53 PM - edited 06-17-2016 01:57 PM

Yes, you can do this, probably in several ways. The following generates all binary sequences between 1 and 2**3-1 = 7:

001

010

011

,,,

111

The sum that you want is coef[1]*x1 + coef[2]*x2 + coef[3]*x3, where coef[i] is the i_th element of the binary sequence (thought of as a number). You can use the SUBSTR function is used to extract the i_th coefficient.

```
data all;
set test;
array x[3] x1-x3;
array comb[7] c1-c7;
label c1="comb3" c2="comb2" c3="comb23"
c4="comb1" c5="comb13" c6="comb12" c7="comb123";
n = dim(x);
do i = 1 to 2**n - 1; /* count 1 to 2**n - 1 */
s = put(i, Binary3.); /* generate binary representation */
sum = 0;
do k = 1 to n; /* sum = SUM of s[k]*x[k] */
coef = substr(s,k,1);
sum + coef*x[k];
end;
comb[i] = sum; /* store sum for this combination of variables */
end;
output;
keep x1-x3 c1-c7;
run;
proc print label;
var x1-x3 c4 c2 c1 c6 c5 c3 c7;
run;
```

Notice that my algorithm does not generate the values in the same order as you did. (You computed sums of variables taken one at a time, taken two at a time, and taken three at a time.) If necessary, I'm sure others on the list can modify my idea to give the results in a different order. To help compare my approach with your, I assigned labels to each variable that agree with your variable names.

The algorithm is explained further in the article "Creating a matrix with all combinations of zeros and ones."

All Replies

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Ying

06-17-2016 01:09 PM

Have you looked at the ALLCOMB function? @Rick_SAS also has a nice series of articles about generating combinations in SAS.

Solution

06-17-2016
02:25 PM

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Ying

06-17-2016 01:53 PM - edited 06-17-2016 01:57 PM

Yes, you can do this, probably in several ways. The following generates all binary sequences between 1 and 2**3-1 = 7:

001

010

011

,,,

111

The sum that you want is coef[1]*x1 + coef[2]*x2 + coef[3]*x3, where coef[i] is the i_th element of the binary sequence (thought of as a number). You can use the SUBSTR function is used to extract the i_th coefficient.

```
data all;
set test;
array x[3] x1-x3;
array comb[7] c1-c7;
label c1="comb3" c2="comb2" c3="comb23"
c4="comb1" c5="comb13" c6="comb12" c7="comb123";
n = dim(x);
do i = 1 to 2**n - 1; /* count 1 to 2**n - 1 */
s = put(i, Binary3.); /* generate binary representation */
sum = 0;
do k = 1 to n; /* sum = SUM of s[k]*x[k] */
coef = substr(s,k,1);
sum + coef*x[k];
end;
comb[i] = sum; /* store sum for this combination of variables */
end;
output;
keep x1-x3 c1-c7;
run;
proc print label;
var x1-x3 c4 c2 c1 c6 c5 c3 c7;
run;
```

Notice that my algorithm does not generate the values in the same order as you did. (You computed sums of variables taken one at a time, taken two at a time, and taken three at a time.) If necessary, I'm sure others on the list can modify my idea to give the results in a different order. To help compare my approach with your, I assigned labels to each variable that agree with your variable names.

The algorithm is explained further in the article "Creating a matrix with all combinations of zeros and ones."

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Rick_SAS

06-17-2016 02:24 PM

Thank you, RicK!

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Rick_SAS

05-30-2017 01:22 AM

I know this post was some time ago, but I found in useful in working out how to do a similar thing but with more variables. The thing I'd like to be able to do but I can't think how is to know which of the combinations were 1, 2, 3 etc factor combinations - any suggestions of where to start looking for an answer.

Jo

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

05-30-2017 06:21 AM

I assume you figured out how to adjust the dimensions of the arrays and the width of the BINARYw. format., The important change is to add S to the KEEP statement. The variable S is a binary string that indicates which variables are in the combination. Thus '10011' represents x1, x2, and x5.

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Rick_SAS

05-30-2017 06:40 PM

Thanks yes, after a bit I worked out to increase the dimensions of the arrays and adjusted the width of the BINARY. I did try adding s to the KEEP statement but I'm assuming it would have been just the last combination that it was showing.

But continuing that line of thought I created a new table with the number of combinations and their BINARYw number (only the 1's)

data new;

do i = 1 to 29;

s = compress(put(i,Binary5.),"0");

output;

I needed to transpose the data anyway so the Combinations (columns) become a variable and then I was able to join the two tables together so instead of having Col1 - Col29 I returned the binary tag created in the new table 1, 11, 111 etc which identified them.

It is a bit of mucking around but I couldn't think of another way to do it. I'm not sure if I should have been able to label the column headings with the new table results somehow.

But continuing that line of thought I created a new table with the number of combinations and their BINARYw number (only the 1's)

data new;

do i = 1 to 29;

s = compress(put(i,Binary5.),"0");

output;

I needed to transpose the data anyway so the Combinations (columns) become a variable and then I was able to join the two tables together so instead of having Col1 - Col29 I returned the binary tag created in the new table 1, 11, 111 etc which identified them.

It is a bit of mucking around but I couldn't think of another way to do it. I'm not sure if I should have been able to label the column headings with the new table results somehow.

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Rick_SAS

06-19-2017 11:51 PM

I have been able to adjust this code to use when I have larger values of n but this gets out of hand quickly. Can you give a clue on how to alter this code to only look at combinations of a set number of values. Eg I might have 5 columns but I don't want to look at all possible 31 combinations but only the combinations that have 3 sample points in each combination = 10 combinations.

Thanks

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

06-20-2017 06:00 AM

The easiest way is to use the ALLCOMB function, as @ChrisHemedinger suggested. See the references in his post. If you want to stay with the binary approach, restrict your attention to strings that have exactly three 1s. If you need further help, I suggest you open a new thread, explain your problem, and post the code that you are using.