Suppose that you have the following data set:
data temp;
do set = 1 to 4;
do count = 0 to 5;
output;
end;
end;
run;
Now, you want to get all possible combinations of sets from this data set such that the sum of the "count" is 5. You have to include each "set" one time.
For example, I want:
combination | set | count |
1 | 1 | 5 |
1 | 2 | 0 |
1 | 3 | 0 |
1 | 4 | 0 |
2 | 1 | 4 |
2 | 2 | 1 |
2 | 3 | 0 |
2 | 4 | 0 |
3 | 1 | 4 |
3 | 2 | 0 |
3 | 3 | 1 |
3 | 4 | 0 |
etc | etc | etc |
Any suggestions on how to accomplish this would be appreciated. TIA!
Just to add a data step solution.
First, if you really have count=0 through 5 for each and every set, then you don't need data set TEMP (as in the proc optmodel solutions):
data want (keep=set1-set4);
sum0=0;
do set1=0 to 5;
sum1=set1+sum0;
if sum1>5 then continue;
do set2=0 to 5;
sum2=sum1+set2;
if sum2>5 then continue;
do set3=0 to 5;
sum3=sum2+set3;
if sum3>5 then continue;
do set4=0 to 5;
sum4=sum3+set4;
if sum4>5 then continue;
if sum4=5 then output;
end;
end;
end;
end;
run;
As I said, this solution doesn't require data set TEMP. But let's say data set TEMP has some "holes", i.e. not every SET has every COUNT from 0 through 5: Then you need to read TEMP to determine which combinations to skip (the if x{n,setn}=. then continue; statements below):
data want (keep=set1-set4);
array x {4,0:5} _temporary_;
set temp end=end_of_temp;
x{set,count}=1;
if end_of_temp;
sum0=0;
do set1=0 to 5;
if x{1,set1}=. then continue;
sum1=set1+sum0;
if sum1>5 then continue;
do set2=0 to 5;
if x{2,set2}=. then continue;
sum2=sum1+set2;
if sum2>5 then continue;
do set3=0 to 5;
if x{3,set3}=. then continue;
sum3=sum2+set3;
if sum3>5 then continue;
do set4=0 to 5;
if x{4,set4}=. then continue;
sum4=sum3+set4;
if sum4>5 then continue;
if sum4=5 then output;
end;
end;
end;
end;
run;
Conversion to macro code can shorten the above. And it would be more amenable to a larger number of sets (just change the %let nsets=4; statement below), or a change in the count range, or target value:
%macro test;
%let nsets=4;
%let countrange=0:5;
%let target=5;
data want (keep=set1-set&nsets);
array x {&nsets,&countrange} _temporary_;
set temp end=end_of_temp;
x{set,count}=1;
if end_of_temp;
sum0=0;
%do s=1 %to &nsets;
do set&s = lbound(x,2) to hbound(x,2);
if x{&s,set&s}=. then continue;
sum&s=set&s + sum%eval(&s-1);
if sum&s>&target then continue;
%end;
if sum&nsets=&target then output;
%do s=1 %to &nsets;
end;
%end;
run;
%mend test;
%test;
Note
the statement
sum&s=set&s + sum%eval(&s-1);
is converted to
sum1=set1 + sum0;
sum2=set2 + sum1;
etc.
I am not seeing how your "want" complies with " sum of the "count" is 5." . You likely have to go into a bit more detail of exactly what "You have to include each "set" one time." means as well.
If you mean that you have other data than your example data set it is quite possible the "sum of count=5" may not occur for "set". So what would be the rule for selecting the one that does not match.
In the example data set I have:
Set 1: 5+0+0+0=5
Set 2: 4+1+0+0=5
Set 3: 4+0+1+0=5
No, there is no other data.
Just to add a data step solution.
First, if you really have count=0 through 5 for each and every set, then you don't need data set TEMP (as in the proc optmodel solutions):
data want (keep=set1-set4);
sum0=0;
do set1=0 to 5;
sum1=set1+sum0;
if sum1>5 then continue;
do set2=0 to 5;
sum2=sum1+set2;
if sum2>5 then continue;
do set3=0 to 5;
sum3=sum2+set3;
if sum3>5 then continue;
do set4=0 to 5;
sum4=sum3+set4;
if sum4>5 then continue;
if sum4=5 then output;
end;
end;
end;
end;
run;
As I said, this solution doesn't require data set TEMP. But let's say data set TEMP has some "holes", i.e. not every SET has every COUNT from 0 through 5: Then you need to read TEMP to determine which combinations to skip (the if x{n,setn}=. then continue; statements below):
data want (keep=set1-set4);
array x {4,0:5} _temporary_;
set temp end=end_of_temp;
x{set,count}=1;
if end_of_temp;
sum0=0;
do set1=0 to 5;
if x{1,set1}=. then continue;
sum1=set1+sum0;
if sum1>5 then continue;
do set2=0 to 5;
if x{2,set2}=. then continue;
sum2=sum1+set2;
if sum2>5 then continue;
do set3=0 to 5;
if x{3,set3}=. then continue;
sum3=sum2+set3;
if sum3>5 then continue;
do set4=0 to 5;
if x{4,set4}=. then continue;
sum4=sum3+set4;
if sum4>5 then continue;
if sum4=5 then output;
end;
end;
end;
end;
run;
Conversion to macro code can shorten the above. And it would be more amenable to a larger number of sets (just change the %let nsets=4; statement below), or a change in the count range, or target value:
%macro test;
%let nsets=4;
%let countrange=0:5;
%let target=5;
data want (keep=set1-set&nsets);
array x {&nsets,&countrange} _temporary_;
set temp end=end_of_temp;
x{set,count}=1;
if end_of_temp;
sum0=0;
%do s=1 %to &nsets;
do set&s = lbound(x,2) to hbound(x,2);
if x{&s,set&s}=. then continue;
sum&s=set&s + sum%eval(&s-1);
if sum&s>&target then continue;
%end;
if sum&nsets=&target then output;
%do s=1 %to &nsets;
end;
%end;
run;
%mend test;
%test;
Note
the statement
sum&s=set&s + sum%eval(&s-1);
is converted to
sum1=set1 + sum0;
sum2=set2 + sum1;
etc.
Hello @cminard,
You can create a Cartesian product (view) with PROC SQL and then transpose the result with a DATA step.
proc sql;
create view _tmp as
select t1.count as a1,
t2.count as a2,
t3.count as a3,
t4.count as a4
from temp(where=(set=1)) t1,
temp(where=(set=2)) t2,
temp(where=(set=3)) t3,
temp(where=(set=4)) t4
having a1+a2+a3+a4=5
order by 1 desc, 2 desc, 3 desc, 4 desc;
quit;
data want(drop=a:);
set _tmp;
array a[*] a:;
combination=_n_;
do set=1 to dim(a);
count=a[set];
output;
end;
run;
It is OR things. Try post it at OR forum and calling @RobPratt
proc optmodel; num numLists=4; set LIST{1..numLists}; LIST[1]=0..5; LIST[2]=0..5; LIST[3]=0..5; LIST[4]=0..5; num target=5; var x{i in 1..numLIsts,j in LIST[i]} binary; con oneitemperlist{i in 1..numLists} : sum{j in LIST[i]} x[i,j]=1; con sumtotarget:sum{i in 1..numLists,j in LIST[i]} j*x[i,j]=target; solve with clp/findallsolns; set SUPPORT{s in 1.._NSOL_}={i in 1..numLists,j in LIST[i]:x[i,j].sol[s]>0.5}; put SUPPORT[*]; quit;
{<1,5>,<2,0>,<3,0>,<4,0>} {<1,4>,<2,0>,<3,0>,<4,1>} {<1,4>,<2,1>,<3,0>,<4,0>} {<1,4>,<2,0>,<3,1>,<4,0>} {<1,3>,<2,1>,<3,0>,<4,1>} {< 1,3>,<2,0>,<3,1>,<4,1>} {<1,3>,<2,1>,<3,1>,<4,0>} {<1,3>,<2,0>,<3,0>,<4,2>} {<1,3>,<2,2>,<3,0>,<4,0>} {<1,3>,<2,0>,<3,2>,<4,0>} {<1, 2>,<2,1>,<3,1>,<4,1>} {<1,2>,<2,1>,<3,0>,<4,2>} {<1,2>,<2,0>,<3,1>,<4,2>} {<1,2>,<2,2>,<3,0>,<4,1>} {<1,2>,<2,2>,<3,1>,<4,0>} {<1,2> ,<2,0>,<3,2>,<4,1>} {<1,2>,<2,1>,<3,2>,<4,0>} {<1,2>,<2,0>,<3,3>,<4,0>} {<1,2>,<2,3>,<3,0>,<4,0>} {<1,2>,<2,0>,<3,0>,<4,3>} {<1,1>,< 2,1>,<3,1>,<4,2>} {<1,1>,<2,2>,<3,0>,<4,2>} {<1,1>,<2,2>,<3,1>,<4,1>} {<1,1>,<2,0>,<3,2>,<4,2>} {<1,1>,<2,2>,<3,2>,<4,0>} {<1,1>,<2, 1>,<3,2>,<4,1>} {<1,1>,<2,0>,<3,3>,<4,1>} {<1,1>,<2,1>,<3,3>,<4,0>} {<1,1>,<2,3>,<3,0>,<4,1>} {<1,1>,<2,3>,<3,1>,<4,0>} {<1,1>,<2,0> ,<3,1>,<4,3>} {<1,1>,<2,1>,<3,0>,<4,3>} {<1,1>,<2,0>,<3,4>,<4,0>} {<1,1>,<2,0>,<3,0>,<4,4>} {<1,1>,<2,4>,<3,0>,<4,0>} {<1,0>,<2,2>,< 3,1>,<4,2>} {<1,0>,<2,1>,<3,2>,<4,2>} {<1,0>,<2,2>,<3,2>,<4,1>} {<1,0>,<2,0>,<3,3>,<4,2>} {<1,0>,<2,1>,<3,3>,<4,1>} {<1,0>,<2,2>,<3, 3>,<4,0>} {<1,0>,<2,3>,<3,0>,<4,2>} {<1,0>,<2,3>,<3,1>,<4,1>} {<1,0>,<2,3>,<3,2>,<4,0>} {<1,0>,<2,2>,<3,0>,<4,3>} {<1,0>,<2,0>,<3,2> ,<4,3>} {<1,0>,<2,1>,<3,1>,<4,3>} {<1,0>,<2,1>,<3,4>,<4,0>} {<1,0>,<2,0>,<3,4>,<4,1>} {<1,0>,<2,0>,<3,1>,<4,4>} {<1,0>,<2,1>,<3,0>,< 4,4>} {<1,0>,<2,4>,<3,1>,<4,0>} {<1,0>,<2,4>,<3,0>,<4,1>} {<1,0>,<2,0>,<3,0>,<4,5>} {<1,0>,<2,0>,<3,5>,<4,0>} {<1,0>,<2,5>,<3,0>,<4, 0>}
解汇总 Solver CLP 变量选择 MINR Objective Function (0) Solution Status All Solutions Found Objective Value 0 Solutions Found 56 Presolve Time 0.00 Solution Time 0.01
That works, or you can use 4 integer variables and 1 linear constraint:
proc optmodel;
num numLists = 4;
num target = 5;
var X {1..numLists} >= 0 <= target integer;
con SumToTarget: sum {i in 1..numLists} X[i] = target;
solve with clp / findallsolns;
create data want from [combination set]={s in 1.._NSOL_, i in 1..numLists} count=X[i].sol[s];
quit;
proc sort data=want;
by combination set;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.