I need to get all combinations of all variables with domain (from incl. 0 to 100 incl. step 10 - namely 0,10,20,30,40.... 100) but only those sets which sum =100 (eg. for 6 variables one of combinations would be getting all combinations which sum =100)
I know that there's an allcomb function that does exacly that but i returns combinations without that condition and this condition dramatically decreases combinations returned.
Maybe an option is recurrsive FCMP function but I don't know how to write that in SAS.
Help would be appriciated
If you have SAS/OR, you should notice that your problem is an integer cutting stock problem and exploit the solutions provided by proc clp. Try for instance :
proc clp findall out=soln;
variable (x1-x4) = [0,10];
linear x1 + x2 + x3 + x4 = 10;
run;
proc print data=soln; run;
PG
I suggest that you post an abbreviated example of your data, variables and desired output. Are you looking at a total of variables within a single record or across multiple records? How many variables do you have. Are all values positive? How do variables with missing values get treated? Values of 0?
Your mention of a domain of 0 to 100 by 10 does not look like a request for variables but of values, which would be an entirely different beast.
Find all the possible splits in the sequence 1,...,10 and count their sizes. Then, eliminate duplicates. The result is expressed as n1 = the number of ones, n2 = the number of twos, etc. In theory, it will work for N up to 63, but I wouldn't try it with N>20 :
data comb(keep=n:);
array n{*} n1-n10;
format n: 2.;
length c $9 b $1;
do k = 0 to 2**(10-2)-1;
c = put(k, binary9.);
do b = "0", "1";
do i = 1 to 10; n{i} = 0; end;
j = 1;
do i = 1 to 9;
if char(c, i) = b then do;
n{j} + 1;
j = 0;
end;
j + 1;
end;
n{j} + 1;
output;
end;
end;
run;
proc sort data=comb nodupkey; by n:; run;
proc print data=comb; run;
-------------------------------------------------------------
The SAS System 20:50 Friday, April 6, 2012 1
Obs n1 n2 n3 n4 n5 n6 n7 n8 n9 n10
1 0 0 0 0 0 0 0 0 0 1
2 0 0 0 0 2 0 0 0 0 0
3 0 0 0 1 0 1 0 0 0 0
4 0 0 1 0 0 0 1 0 0 0
5 0 0 2 1 0 0 0 0 0 0
6 0 1 0 0 0 0 0 1 0 0
7 0 1 0 2 0 0 0 0 0 0
8 0 1 1 0 1 0 0 0 0 0
9 0 2 0 0 0 1 0 0 0 0
10 0 2 2 0 0 0 0 0 0 0
11 0 3 0 1 0 0 0 0 0 0
12 0 5 0 0 0 0 0 0 0 0
13 1 0 0 0 0 0 0 0 1 0
14 1 0 0 1 1 0 0 0 0 0
15 1 0 1 0 0 1 0 0 0 0
16 1 0 3 0 0 0 0 0 0 0
17 1 1 0 0 0 0 1 0 0 0
18 1 1 1 1 0 0 0 0 0 0
19 1 2 0 0 1 0 0 0 0 0
20 1 3 1 0 0 0 0 0 0 0
21 2 0 0 0 0 0 0 1 0 0
22 2 0 0 2 0 0 0 0 0 0
23 2 0 1 0 1 0 0 0 0 0
24 2 1 0 0 0 1 0 0 0 0
25 2 1 2 0 0 0 0 0 0 0
26 2 2 0 1 0 0 0 0 0 0
27 2 4 0 0 0 0 0 0 0 0
28 3 0 0 0 0 0 1 0 0 0
29 3 0 1 1 0 0 0 0 0 0
30 3 1 0 0 1 0 0 0 0 0
31 3 2 1 0 0 0 0 0 0 0
32 4 0 0 0 0 1 0 0 0 0
33 4 0 2 0 0 0 0 0 0 0
34 4 1 0 1 0 0 0 0 0 0
35 4 3 0 0 0 0 0 0 0 0
36 5 0 0 0 1 0 0 0 0 0
37 5 1 1 0 0 0 0 0 0 0
38 6 0 0 1 0 0 0 0 0 0
39 6 2 0 0 0 0 0 0 0 0
40 7 0 1 0 0 0 0 0 0 0
41 8 1 0 0 0 0 0 0 0 0
42 10 0 0 0 0 0 0 0 0 0
Example please?
For example - I need all combinations of 3 variables v1, v2, v3 (each can be either of these values: 0,10,20,30,40.... 100) but only combinations in which v1+v2+v3 =100
v1 v2 v3 sum
10 80 10 100 - OK
10 90 10 110 - I don't want this combiantion
30 30 40 100 -OK
This is an example. I need to have possibility to make combinations for N variables. The clue is not to make all combinations and afterwards eliminating those that don't meet conditions - the number of all combinations is much greater then the number of combinations with sum=100 - So during generation of combinations sum should be checked to optimize generation time.
Is this a real life problem? Seems kind of odd...
If you plan to create you combinations within a data step, you could use a subsetting if with the sum() function.
If you var1-varn uses such naming convention you can easily use the sum(of..) construct. Otherwise, link your variables to an array, which you then sum up.
if sum(of v:) = 100;
Linus, I've seen something like this as a real-life problem. Given up to 20 recent transactions, find any combinations that sum to 0. The idea is that some transactions may be refunds for a combination of other purchases. You need to match the refund to the purchases.
At any rate, here's an approach simplified for 3 variables. The idea is to find combinations of v1-v3 that sum to 100.
do in1 = 'OUT', 'IN';
do in2 = 'OUT', 'IN';
do in3 = 'OUT', 'IN';
total = v1 * (in1='IN') + v2 * (in2='IN') + v3 * (in3='IN');
if total=100 then output;
end;
end;
end;
You can always switch to arrays if the number of variables becomes large. But note that your program canl chug for a while as the number of variables increases.
If you have to, you can always place this code inside a set of loops like this:
do v1=0 to 100 by 10;
do v2=0 to 100 by 10;
do v3=0 to 100 by 10;
** Above code;
end;
end;
end;
I'm just not certain whether this is part of the problem though.
Good luck!
The problem is that I need to calculate that with given number of variables. I need a macro that has N-variables parameter on entry. In that case depending on N I will have N loops
tom,
If I understand the problem correctly this time, you always want to get the sum of all N variables. That's relatively easy for a macro to generate, but as you have noted it may involve a great number of combinations.
Assuming you have a parameter &N with the number of variables, this belongs inside a macro definition:
%local i;
%do i=1 %to &N;
do v&i = 0 to 100 by 10;
%end;
if sum (of v1-v&N)=100 then output;
%do i=1 %to &N;
end;
%end;
Good luck.
There's something wrong with the syntax. Which ones should be of macro style (%if) and which of data step style (without %)? As far as i know if you change "if sum (of v1-v&N)=100 then output;" to macro style then you can't do arithetics.
Thanks for help anyway.
tom,
The syntax is right. Let me put it in context for you, again using just 3 variables. I'm not saying this is the fastest way, but it is easy to program (expecially if you don't have SAS/OR). Here's the final generated program for N=3:
data solutions;
do v1=0 to 100 by 10;
do v2=0 to 100 by 10;
do v3 = 0 to 100 by 10;
if sum(of v1-v3)=100 then output;
end;
end;
end;
run;
So this doesn't try to reduce the number of permutations in any way. A more full-blown macro would let you pass a value for &N. Maybe you can picture this using N=3, and compare it to the code above:
%macro add_to_100 (N);
data solutions;
%local i;
%do i=1 %to &N;
do v&i = 0 to 100 by 10;
%end;
if sum(of v1-v&N) = 100 then output;
%do i=1 %o &N;
end;
%end;
run;
%mend add_to_100;
I know you have a solution ... I'm just dotting the i's and crossing the t's here.
If you have SAS/OR, you should notice that your problem is an integer cutting stock problem and exploit the solutions provided by proc clp. Try for instance :
proc clp findall out=soln;
variable (x1-x4) = [0,10];
linear x1 + x2 + x3 + x4 = 10;
run;
proc print data=soln; run;
PG
works thank, but is there a possibility to enter near linear some generic contraint with meaning: sum of all x variables like: linear sum (x
Then I wouldn't have to generate dynamically constraint with dynamic number of variables. I will have N variables - N can change and will be a input parameter of a macro.
As I have shown above, there is no need to check the sum at generation time if all integers are allowed in the combination; you only need to figure out all the possible subsets of integers between 1 and 10. What I don't understand is what kind of advantage you may gain in having that set of possible combinations. Checking that a given combination is a member of the set is a lot more expensive than simply checking the sum.
PG
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.