Xia - The solver code in SAS/OR is not a data step. It is written in C.
The algorithm used in SAS/OR is a variant of the branch-and-cut algorithm for integer programming. You can see some high level information about the algorithm here: SAS/OR(R) 13.2 User's Guide: Mathematical Programming. Or, there are many textbooks on the topic - see, for example: Integer and Combinatorial Optimization: Laurence A. Wolsey, George L. Nemhauser: 9780471359432: Amaz...
Hi Matt,
Thanks for sharing that book . Hope one day I could buy one and have time to read about it .
I ended up with a better and faster BACKWARDS algorithm (although it is not best algorithm).
For this tiny example, on my machine,
data have; input Player Position $ Points Price ; cards; 1 G 2 8000 2 D 4 7000 3 M 6 12000 4 A 8 10000 5 G 4 9000 6 A 9 14000 8 D 4 8000 9 M 6 4000 10 A 8 10000 12 A 9 14000 13 D 4 7000 14 M 6 12000 17 A 9 14000 18 G 2 8000 19 D 4 7000 20 M 6 12000 24 D 4 7000 25 M 6 12000 29 D 4 7000 30 M 6 7000 32 M 6 1000 36 D 4 7000 37 M 6 12000 42 D 4 7000 43 M 6 12000 45 G 4 9000 46 A 9 14000 47 D 4 7000 48 M 6 12000 55 D 4 7000 56 M 6 12000 60 D 4 6000 61 M 6 9000 63 M 6 2000 67 D 4 5000 68 M 6 9000 70 M 8 6000 ; run; proc sort data=have ;by position descending Points Price;run; data _null_; set have; by position ; if last.position then call symputx(position,_n_); run; data _null_; set have end=last; length list $ 100 point_sum point_max cost _cost 8 ; array pla{&M} _temporary_ ; array pos{&M} $ _temporary_ ; array poi{&M} _temporary_ ; array pri{&M} _temporary_ ; pla{_n_}=Player; pos{_n_}=Position; poi{_n_}=Points; pri{_n_}=Price; if last then do; offset=-1; do until(offset=max(&A-2,&D-&A-4,&G-&D-1,&M-&G-4)); offset+1; do i1=1 to min(2+offset,&A) ; do i2=i1+1 to min(2+offset,&A) ; do j1=%eval(&A+1) to min(&A+4+offset,&D) ; do j2=j1+1 to min(&A+4+offset,&D); do j3=j2+1 to min(&A+4+offset,&D) ; do j4=j3+1 to min(&A+4+offset,&D) ; do m=%eval(&D+1) to min(&D+1+offset,&G) ; do n1=%eval(&G+1) to min(&G+4+offset,&M) ; do n2=n1+1 to min(&G+4+offset,&M) ; do n3=n2+1 to min(&G+4+offset,&M) ; do n4=n3+1 to min(&G+4+offset,&M) ; if i1=min(2+offset,&A) or i2=min(2+offset,&A) or j1=min(&A+4+offset,&D) or j2=min(&A+4+offset,&D) or j3=min(&A+4+offset,&D) or j4=min(&A+4+offset,&D) or m=min(&D+1+offset,&G) or n1=min(&G+4+offset,&M) or n2=min(&G+4+offset,&M) or n3=min(&G+4+offset,&M) or n4=min(&G+4+offset,&M) then do; point_sum=sum(poi{i1},poi{i2},poi{j1},poi{j2},poi{j3},poi{j4},poi{m},poi{n1},poi{n2},poi{n3},poi{n4}); cost=sum(pri{i1},pri{i2},pri{j1},pri{j2},pri{j3},pri{j4},pri{m},pri{n1},pri{n2},pri{n3},pri{n4}); if point_sum gt point_max and cost le 75000 then do; point_max=point_sum; _cost=cost; list=catx('|',pla{i1},pla{i2},pla{j1},pla{j2},pla{j3},pla{j4},pla{m},pla{n1},pla{n2},pla{n3},pla{n4}); end; if not missing(list) then do; putlog 'Players : ' list 'Max Points : ' point_max 'Cost : ' _cost; stop; end; end; end; end; end; end; end; end; end; end; end; end; end; end; putlog 'Not found solution.'; end; run; Players : 6|12|67|60|2|13|5|70|32|63|9 Max Points : 64 Cost : 75000
Xia Keshan
Message was edited by: xia keshan
Message was edited by: xia keshan
Good job.
Unfortunately, I think your approach might start to struggle if you change any of the rules associated with the model (for example, try to apply that to a different sport for Fantasy, or change a rule), or if you scale the problem up to more realistic sizes (since you are using "brute force").
As an example, let's consider a larger field of players (more realistic for Fantasy Sports applications). We can simulate that as such:
%macro createData(n=15);
data have(drop=i rename=(count=player));
retain count 0;
set have;
do i = 1 to &n;
count+1;
playerNumber = player;
output;
end;
run;
%mend createData;
%createData();
Running your code took 49 minutes.
Players : 76|77|511|512|513|514|61|541|542|301|302 Max Points : 66 Cost : 71000
NOTE: DATA statement used (Total process time):
real time 49:02.72
cpu time 49:02.69
And running my code took: 0.08s
NOTE: Optimal.
NOTE: Objective = 68.
NOTE: PROCEDURE OPTMODEL used (Total process time):
real time 0.08 seconds
cpu time 0.07 seconds
Also, are you sure your code is exact? The OR solver found a better solution (68).
169 %macro createData(n=15);
170 data have(drop=i rename=(count=player));
171 retain count 0;
172 set have;
173 do i = 1 to &n;
174 count+1;
175 playerNumber = player;
176 output;
177 end;
178 run;
179 %mend createData;
180 %createData()
WARNING: Variable Player already exists on file WORK.HAVE.
Why I would got this message.
As I said . My algorithm is Backward , so if the solution is at high level , my code would be very fast, whereas if it is at low level ,mine would cost lots of time. But that is the start point problem.
"brute force" , I don't agree with that, I am partially brute force . first four member then five member .....six .... .If the solution is at the top ,that would be very fast.
Of courese. I have a long way to walk.
Yeah. You are right. I need to reconsider it for a while .
Message was edited by: xia keshan
Haha Matt,
I ended up with Genetic Algorithm . It is so sharp and powerful and almost be able to solve all kind of optimum problem.
Which means You can solve all the optimum problems without any money if you have SAS University Edition.Code is written in IML.
It is so amazing and so fast (0.4 sec). I love it too much.
data have;
input Player Position $ Points Price ;
cards;
1 G 2 8000
2 D 4 7000
3 M 6 12000
4 A 8 10000
5 G 4 9000
6 A 9 14000
8 D 4 8000
9 M 6 4000
10 A 8 10000
12 A 9 14000
13 D 4 7000
14 M 6 12000
17 A 9 14000
18 G 2 8000
19 D 4 7000
20 M 6 12000
24 D 4 7000
25 M 6 12000
29 D 4 7000
30 M 6 7000
32 M 6 1000
36 D 4 7000
37 M 6 12000
42 D 4 7000
43 M 6 12000
45 G 4 9000
46 A 9 14000
47 D 4 7000
48 M 6 12000
55 D 4 7000
56 M 6 12000
60 D 4 6000
61 M 6 9000
63 M 6 2000
67 D 4 5000
68 M 6 9000
70 M 8 6000
;
run;
%macro createData(n=15);
data have(drop=i rename=(count=player));
set have(drop=player);
do i = 1 to &n;
count+1;
output;
end;
run;
%mend createData;
%createData()
options fullstimer;
proc sql noprint;
select count(*) into : A from have where Position='A';
select count(*) into : D from have where Position='D';
select count(*) into : G from have where Position='G';
select count(*) into : M from have where Position='M';
quit;
proc iml;
use have;
read all var _all_ where(Position='A');
Player_A=Player;Point_A=Points;Pice_A=Price;
read all var _all_ where(Position='D');
Player_D=Player;Point_D=Points;Pice_D=Price;
read all var _all_ where(Position='G');
Player_G=Player;Point_G=Points;Pice_G=Price;
read all var _all_ where(Position='M');
Player_M=Player;Point_M=Points;Pice_M=Price;
close have;
start football(x) global(Point_A,Pice_A,Point_D,Pice_D,Point_G,Pice_G,Point_M,Pice_M);
points=sum(Point_A[x[1:2]],Point_D[x[3:6]],Point_G[x[7]],Point_M[x[8:11]]);
sum_cost=sum(Pice_A[x[1:2]],Pice_D[x[3:6]],Pice_G[x[7]],Pice_M[x[8:11]]);
if sum_cost>75000 then points=1;
return (points);
finish;
start uniform_cross(child1, child2, parent1, parent2);
child1 = parent1;
child2 = parent2;
do i = 1 to ncol(parent1);
r = uniform(1234);
if r<=0.5 then do;
child1 = parent2;
child2 = parent1;
end;
end;
finish;
id=gasetup(2,11,123);
call gasetobj(id,1,"football");
call gasetcro(id,1.0,0,"uniform_cross");
call gasetmut(id,0.2,1);
call gasetsel(id,3,1,0.95);
call gainit(id,2000,{1 1 1 1 1 1 1 1 1 1 1 ,
&A &A &D &D &D &D &G &M &M &M &M });
niter = 100;
summary = j(niter,2);
mattrib summary [c = {"Max Points", "Avg Points"} l=""];
do i = 1 to niter;
call garegen(id);
call gagetval(value, id);
summary[i,1] = value[1];
summary[i,2] = value[:];
end;
call gagetmem(mem, value, id, 1);
Players=Player_A[mem[1:2]]//Player_D[mem[3:6]]//Player_G[mem[7]]//Player_M[mem[8:11]];
sum_cost=sum(Pice_A[mem[1:2]],Pice_D[mem[3:6]],Pice_G[mem[7]],Pice_M[mem[8:11]]);
print Players[ l="best member:"],
"Max Points: " value[l = ""],
"Total Cost: " sum_cost[l = ""] ;
iteration = t(1:niter);
print iteration summary;
call gaend(id);
quit;
Xia Keshan
Yes. GA is a way to approximately solve some optimization problems. It can sometimes be used as a heuristic in conjunction with an optimziation solver.
Are you sure your code is correct?
If I run with createData(n=2), then I get as a result:
best member:
7
8
3
70
70
69
9
74
73
73
73
Which uses 70 twice and 73 three times, which is invalid. Also, the final result is wrong, it says max points = 68, but the actual max points for n=2 is 66.
Ha. I know that , there is a little glitch . This ought to be right . GA is so powerful , You don't need OR any more . Just kidding .
data have;
input Player Position $ Points Price ;
cards;
1 G 2 8000
2 D 4 7000
3 M 6 12000
4 A 8 10000
5 G 4 9000
6 A 9 14000
8 D 4 8000
9 M 6 4000
10 A 8 10000
12 A 9 14000
13 D 4 7000
14 M 6 12000
17 A 9 14000
18 G 2 8000
19 D 4 7000
20 M 6 12000
24 D 4 7000
25 M 6 12000
29 D 4 7000
30 M 6 7000
32 M 6 1000
36 D 4 7000
37 M 6 12000
42 D 4 7000
43 M 6 12000
45 G 4 9000
46 A 9 14000
47 D 4 7000
48 M 6 12000
55 D 4 7000
56 M 6 12000
60 D 4 6000
61 M 6 9000
63 M 6 2000
67 D 4 5000
68 M 6 9000
70 M 8 6000
;
run;
%macro createData(n=2);
data have(drop=i rename=(count=player));
set have(drop=player);
do i = 1 to &n;
count+1;
output;
end;
run;
%mend createData;
%createData()
options fullstimer;
proc sort data=have;by Position Points ;run;
proc sql noprint;
select count(*) into : A from have where Position='A';
select count(*) into : D from have where Position='D';
select count(*) into : G from have where Position='G';
select count(*) into : M from have where Position='M';
quit;
proc iml;
use have;
read all var _all_ where(Position='A');
Player_A=Player;Point_A=Points;Pice_A=Price;
read all var _all_ where(Position='D');
Player_D=Player;Point_D=Points;Pice_D=Price;
read all var _all_ where(Position='G');
Player_G=Player;Point_G=Points;Pice_G=Price;
read all var _all_ where(Position='M');
Player_M=Player;Point_M=Points;Pice_M=Price;
close have;
start football(x) global(Point_A,Pice_A,Point_D,Pice_D,Point_G,Pice_G,Point_M,Pice_M);
points=sum(Point_A[x[1:2]],Point_D[x[3:6]],Point_G[x[7]],Point_M[x[8:11]]);
sum_cost=sum(Pice_A[x[1:2]],Pice_D[x[3:6]],Pice_G[x[7]],Pice_M[x[8:11]]);
if sum_cost>75000 | ncol(unique(x[8:11]))^=4 | ncol(unique(x[3:6]))^=4 | ncol(unique(x[1:2]))^=2 then points=1;
return (points);
finish;
id=gasetup(2,11,123);
call gasetobj(id,1,"football");
call gasetcro(id,1.0,2);
call gasetmut(id,0.2,2,1);
call gasetsel(id,100,1,0.95);
call gainit(id,10000,{1 1 1 1 1 1 1 1 1 1 1 ,
&A &A &D &D &D &D &G &M &M &M &M });
niter = 100;
summary = j(niter,2);
mattrib summary [c = {"Max Points", "Avg Points"} l=""];
do i = 1 to niter;
call garegen(id);
call gagetval(value, id);
summary[i,1] = value[1];
summary[i,2] = value[:];
end;
call gagetmem(mem, value, id, 1);
Players=Player_A[mem[1:2]]//Player_D[mem[3:6]]//Player_G[mem[7]]//Player_M[mem[8:11]];
sum_cost=sum(Pice_A[mem[1:2]],Pice_D[mem[3:6]],Pice_G[mem[7]],Pice_M[mem[8:11]]);
print Players[ l="best member:"],
"Max Points: " value[l = ""],
"Total Cost: " sum_cost[l = ""] ;
iteration = t(1:niter);
print iteration summary;
call gaend(id);
quit;
Xia Keshan
Message was edited by: xia keshan
Message was edited by: xia keshan
Message was edited by: xia keshan
Matt,
The funny thing is when the problem become some more complicated (add more obs and have more different Points) , I found OR is a hell of a lot slower than GA .
Xia Keshan
Yes - that is expected, that a GA approach could be faster than an optimization solver, since GA is only a heuristic, provides no bound on the quality of the solution, and can provide solutions arbitrarily far from optimal.
That is, how do you know if your GA solution is good? Is it Optimal? Is it close to Optimal?
I know. But it is depend on the complex of problem, if number of obs is not too big , I believe I can get that optimal solution .
And the most important thing is GA can almost solve all sort of optimum problem . That is the point .
Xia Keshan
I think you are missing the point. How do you know you are getting the optimal? How do you know you are even getting close to optimal?
You also need to be careful about the definition of "solve".
Also, I think your statement that "GA can almost solve all sort of optimum problem" is quite naive.
"GA can almost solve all sort of optimum problem" is not from me . is from documentation of GA , I just quote it here .
Can you please point me to the documentation that has this statement? Thank you.
Hi Matt,
I might misunderstood the meaning of documentation . Said what you will , I still think it is a incredible tool .
Genetic algorithms (referred to hereafter as GAs) are a family of search algorithms that seek optimal solutions
to problems using the principles of natural selection and evolution. GAs can be applied to almost any
optimization problem and are especially useful for problems where other calculus-based techniques do not
work, such as when the objective function has many local optimums, it is not differentiable or continuous, or
solution elements are constrained to be integers or sequences. In most cases GAs require more computation
than specialized techniques that take advantage of specific problem structure or characteristics. However, for
optimization problems with no such techniques available, GAs provide a robust general method of solution.
Yes. You misunderstood the meaning. You have to be careful about what you mean when saying you "solve an optimization problem". GA is a great tool and very useful in certain cases to approximately solve some optimization problems. But, it is no replacement for an optimization solver, if you care about the quality of the solution.
In some cases, we have used GA and optimziation together. The GA is used to find an initial feasible solution, which is then used as a starting point (primalin) for the integer programming solver.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.
Find more tutorials on the SAS Users YouTube channel.