I think is an OR question.
Assume there are 18 numbers:
data tab1;
input x;
cards;
22.57
62.43
56.01
52.71
24.44
20.34
16
14.73
17.52
20.15
22.76
18.59
30.14
14.63
23.13
11
26.39
149.18
;
run;
What is the combination of these numbers whose summation is most close to 500?
That is to say, you are asked to find out combinations of these numbers, to make the summation of these numbers as close to 500 as possible.
Here is what I do:
%let n=18;
%let target=500;
data tab2;
array d[&n.]_temporary_;
array nx[&n.];
do i=1 by 1 until(eof);
set tab1 end=eof;
d[i]=x;
end;
do i=1 to dim(d);
ncomb=comb(dim(d),i);
do j=1 to ncomb;
call allcomb(j,i,of d[*]); *All Combinations;
*Calculate sum and distance to target of this combination;
sum=0;
do k=1 to i;
sum+d[k];
end;
abs_dif=abs(sum-&target.);
*Get minimum distance to target and responding combination;
if abs_dif<min_abs_dif or min_abs_dif=. then do;
min_abs_dif=abs_dif;
do k=1 to i;
nx[k]=d[k];
end;
end;
end;
end;
run;
The result is located at array NX.
My question is: How to slove this kind of problem(maybe it is Knapsack problem) using proc optmodel? Well, an elegant data step method is welcomed, too.
PS: I have checked Knapsack problem example in SAS doc, but it is too complex to understand.
Here are two ways, both using the MILP solver through PROC OPTMODEL:
/* automated linearization in SAS Viya */
proc optmodel;
set ITEMS;
num x {ITEMS};
read data tab1 into ITEMS=[_N_] x;
var UseItem {ITEMS} binary;
min Error = abs(sum {j in ITEMS} x[j]*UseItem[j] - &target);
solve linearize;
print x UseItem;
quit;
/* manual linearization */
proc optmodel;
set ITEMS;
num x {ITEMS};
read data tab1 into ITEMS=[_N_] x;
var UseItem {ITEMS} binary;
var Over >= 0;
var Under >= 0;
min Error = Over + Under;
con sum {j in ITEMS} x[j]*UseItem[j] - Over + Under = ⌖
solve;
print x UseItem;
quit;
Both yield optimal objective value 0 with the following solution:
[1] | x | UseItem |
---|---|---|
1 | 22.57 | 0 |
2 | 62.43 | 1 |
3 | 56.01 | 1 |
4 | 52.71 | 1 |
5 | 24.44 | 1 |
6 | 20.34 | 1 |
7 | 16.00 | 0 |
8 | 14.73 | 1 |
9 | 17.52 | 1 |
10 | 20.15 | 1 |
11 | 22.76 | 1 |
12 | 18.59 | 1 |
13 | 30.14 | 1 |
14 | 14.63 | 0 |
15 | 23.13 | 0 |
16 | 11.00 | 1 |
17 | 26.39 | 0 |
18 | 149.18 | 1 |
It's called the Subset Sum Problem!
Koen
Here are two ways, both using the MILP solver through PROC OPTMODEL:
/* automated linearization in SAS Viya */
proc optmodel;
set ITEMS;
num x {ITEMS};
read data tab1 into ITEMS=[_N_] x;
var UseItem {ITEMS} binary;
min Error = abs(sum {j in ITEMS} x[j]*UseItem[j] - &target);
solve linearize;
print x UseItem;
quit;
/* manual linearization */
proc optmodel;
set ITEMS;
num x {ITEMS};
read data tab1 into ITEMS=[_N_] x;
var UseItem {ITEMS} binary;
var Over >= 0;
var Under >= 0;
min Error = Over + Under;
con sum {j in ITEMS} x[j]*UseItem[j] - Over + Under = ⌖
solve;
print x UseItem;
quit;
Both yield optimal objective value 0 with the following solution:
[1] | x | UseItem |
---|---|---|
1 | 22.57 | 0 |
2 | 62.43 | 1 |
3 | 56.01 | 1 |
4 | 52.71 | 1 |
5 | 24.44 | 1 |
6 | 20.34 | 1 |
7 | 16.00 | 0 |
8 | 14.73 | 1 |
9 | 17.52 | 1 |
10 | 20.15 | 1 |
11 | 22.76 | 1 |
12 | 18.59 | 1 |
13 | 30.14 | 1 |
14 | 14.63 | 0 |
15 | 23.13 | 0 |
16 | 11.00 | 1 |
17 | 26.39 | 0 |
18 | 149.18 | 1 |
Hello Rob,
How to create the result as a dataset including variable x and UseItem other than just printing them out?
create data want from [item] x UseItem;
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.
Find more tutorials on the SAS Users YouTube channel.