BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Ying
Fluorite | Level 6

Hello, 

 

If I have data with variable x1, x2, x3, and I need to find all the possible combination of 3 (there are 7 ).  Is there any elegant way to do this instead of list all the combination?  Because I might end up with 5 or 10 variables.

 

data test;

  input x1 x2 x3;

datalines;

1 2 3

3 4 5

;

run;

 

data all;

  set test;

 

/***is there any neat way to do this combination, instead of hard coded all the combinations?***/

 comb1 = x1;

comb2 = x2;

comb3 = x3;

comb12 = sum(x1, x2);

comb13 = sum(x1, x3);

comb23 = sum(x2, x3);

comb123 = sum(x1, x2, x3);

run;

 

thank you!

1 ACCEPTED SOLUTION

Accepted Solutions
Rick_SAS
SAS Super FREQ

Yes, you can do this, probably in several ways. The following generates all binary sequences between 1 and 2**3-1 = 7:

001

010

011

,,,

111

The sum that you want is coef[1]*x1 + coef[2]*x2 + coef[3]*x3, where coef[i] is the i_th element of the binary sequence (thought of as a number). You can use the SUBSTR function is used to extract the i_th coefficient.

 

 

data all;
set test;
array x[3] x1-x3;
array comb[7] c1-c7;
label c1="comb3" c2="comb2" c3="comb23"
      c4="comb1" c5="comb13" c6="comb12" c7="comb123";
n = dim(x);
do i = 1 to 2**n - 1;         /* count 1 to 2**n - 1 */
   s = put(i, Binary3.);      /* generate binary representation */
   sum = 0;
   do k = 1 to n;             /* sum = SUM of s[k]*x[k] */
      coef = substr(s,k,1);
      sum + coef*x[k];
   end;
   comb[i] = sum;             /* store sum for this combination of variables */
end;   
output;
keep x1-x3 c1-c7;
run;
proc print label; 
var x1-x3 c4 c2 c1 c6 c5 c3 c7;
run;

 

Notice that my algorithm does not generate the values in the same order as you did. (You computed sums of variables taken one at a time, taken two at a time, and taken three at a time.) If necessary, I'm sure others on the list can modify my idea to give the results in a different order.  To help compare my approach with your, I assigned labels to each variable that agree with your variable names.

 

The algorithm is explained further in the article "Creating a matrix with all combinations of zeros and ones."

View solution in original post

8 REPLIES 8
ChrisHemedinger
Community Manager

Have you looked at the ALLCOMB function?  @Rick_SAS also has a nice series of articles about generating combinations in SAS.

It's time to register for SAS Innovate! Join your SAS user peers in Las Vegas on April 16-19 2024.
Rick_SAS
SAS Super FREQ

Yes, you can do this, probably in several ways. The following generates all binary sequences between 1 and 2**3-1 = 7:

001

010

011

,,,

111

The sum that you want is coef[1]*x1 + coef[2]*x2 + coef[3]*x3, where coef[i] is the i_th element of the binary sequence (thought of as a number). You can use the SUBSTR function is used to extract the i_th coefficient.

 

 

data all;
set test;
array x[3] x1-x3;
array comb[7] c1-c7;
label c1="comb3" c2="comb2" c3="comb23"
      c4="comb1" c5="comb13" c6="comb12" c7="comb123";
n = dim(x);
do i = 1 to 2**n - 1;         /* count 1 to 2**n - 1 */
   s = put(i, Binary3.);      /* generate binary representation */
   sum = 0;
   do k = 1 to n;             /* sum = SUM of s[k]*x[k] */
      coef = substr(s,k,1);
      sum + coef*x[k];
   end;
   comb[i] = sum;             /* store sum for this combination of variables */
end;   
output;
keep x1-x3 c1-c7;
run;
proc print label; 
var x1-x3 c4 c2 c1 c6 c5 c3 c7;
run;

 

Notice that my algorithm does not generate the values in the same order as you did. (You computed sums of variables taken one at a time, taken two at a time, and taken three at a time.) If necessary, I'm sure others on the list can modify my idea to give the results in a different order.  To help compare my approach with your, I assigned labels to each variable that agree with your variable names.

 

The algorithm is explained further in the article "Creating a matrix with all combinations of zeros and ones."

Ying
Fluorite | Level 6

Thank you, RicK! 

jo1
Obsidian | Level 7 jo1
Obsidian | Level 7

I know this post was some time ago, but I found in useful in working out how to do a similar thing but with more variables.  The thing I'd like to be able to do but I can't think how is to know which of the combinations were 1, 2, 3 etc factor combinations - any suggestions of where to start looking for an answer.

 

Jo

Rick_SAS
SAS Super FREQ

I assume you figured out how to adjust the dimensions of the arrays and the width of the BINARYw. format., The important change is to add S to the KEEP statement. The variable S is a binary string that indicates which variables are in the combination. Thus '10011' represents x1, x2, and x5.

jo1
Obsidian | Level 7 jo1
Obsidian | Level 7
Thanks yes, after a bit I worked out to increase the dimensions of the arrays and adjusted the width of the BINARY. I did try adding s to the KEEP statement but I'm assuming it would have been just the last combination that it was showing.
But continuing that line of thought I created a new table with the number of combinations and their BINARYw number (only the 1's)
data new;
do i = 1 to 29;
s = compress(put(i,Binary5.),"0");
output;
I needed to transpose the data anyway so the Combinations (columns) become a variable and then I was able to join the two tables together so instead of having Col1 - Col29 I returned the binary tag created in the new table 1, 11, 111 etc which identified them.
It is a bit of mucking around but I couldn't think of another way to do it. I'm not sure if I should have been able to label the column headings with the new table results somehow.
jo1
Obsidian | Level 7 jo1
Obsidian | Level 7

I have been able to adjust this code to use when I have larger values of n but this gets out of hand quickly.  Can you give a clue on how to alter this code to only look at combinations of a set number of values.  Eg I might have 5 columns but I don't want to look at all possible 31 combinations but only the combinations that have 3 sample points in each combination = 10 combinations.

 

Thanks

Rick_SAS
SAS Super FREQ

The easiest way is to use the ALLCOMB function, as @ChrisHemedinger suggested. See the references in his post. If you want to stay with the binary approach, restrict your attention to strings that have exactly three 1s.  If you need further help, I suggest you open a new thread, explain your problem, and post the code that you are using.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 3134 views
  • 1 like
  • 4 in conversation