BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Matthew_Galati
SAS Employee

Xia - The solver code in SAS/OR is not a data step. It is written in C.

The algorithm used in SAS/OR is a variant of the branch-and-cut algorithm for integer programming. You can see some high level information about the algorithm here: SAS/OR(R) 13.2 User's Guide: Mathematical Programming. Or, there are many textbooks on the topic - see, for example: Integer and Combinatorial Optimization: Laurence A. Wolsey, George L. Nemhauser: 9780471359432: Amaz...

Ksharp
Super User

Hi Matt,

Thanks for sharing that book . Hope one day I could buy one and have time to read about it .

I ended up with a better and faster BACKWARDS algorithm (although it is not best algorithm).

For this tiny example, on my machine,

  • original code ran in: 9 seconds
  • this one ran in 0.03 seconds

 
 
data have;
input Player Position $  Points Price     ;
cards;
1 G 2 8000
2 D 4 7000
3 M 6 12000
4 A 8 10000
5 G 4 9000
6 A 9 14000
8 D 4 8000
9 M 6 4000
10 A 8 10000
12 A 9 14000
13 D 4 7000
14 M 6 12000
17 A 9 14000
18 G 2 8000
19 D 4 7000
20 M 6 12000
24 D 4 7000
25 M 6 12000
29 D 4 7000
30 M 6 7000
32 M 6 1000
36 D 4 7000
37 M 6 12000
42 D 4 7000
43 M 6 12000
45 G 4 9000
46 A 9 14000
47 D 4 7000
48 M 6 12000
55 D 4 7000
56 M 6 12000
60 D 4 6000
61 M 6 9000
63 M 6 2000
67 D 4 5000
68 M 6 9000
70 M 8 6000
;
run;


proc sort data=have ;by position descending Points Price;run;
data _null_;
 set have;
 by position ;
 if last.position then call symputx(position,_n_);
run;
data _null_;
 set have end=last;
 length list $ 100 point_sum point_max cost _cost 8 ;
 array pla{&M}  _temporary_ ;
 array pos{&M} $  _temporary_ ;
 array poi{&M}  _temporary_ ;
 array pri{&M}  _temporary_ ;
 pla{_n_}=Player;
 pos{_n_}=Position;
 poi{_n_}=Points;
 pri{_n_}=Price;

if last then do; 
offset=-1;
do until(offset=max(&A-2,&D-&A-4,&G-&D-1,&M-&G-4));
offset+1;
do i1=1 to min(2+offset,&A) ;
do i2=i1+1 to  min(2+offset,&A) ;
  do j1=%eval(&A+1) to  min(&A+4+offset,&D) ;
  do j2=j1+1 to min(&A+4+offset,&D);
  do j3=j2+1 to min(&A+4+offset,&D) ;
  do j4=j3+1 to min(&A+4+offset,&D) ;
    do m=%eval(&D+1) to min(&D+1+offset,&G) ;
      do n1=%eval(&G+1) to min(&G+4+offset,&M) ;
       do n2=n1+1 to min(&G+4+offset,&M) ;
       do n3=n2+1 to min(&G+4+offset,&M) ;
       do n4=n3+1 to min(&G+4+offset,&M) ;

          if i1=min(2+offset,&A) or i2=min(2+offset,&A)     or
              j1=min(&A+4+offset,&D) or j2=min(&A+4+offset,&D) or     j3=min(&A+4+offset,&D) or j4=min(&A+4+offset,&D) or
               m=min(&D+1+offset,&G) or
            n1=min(&G+4+offset,&M) or n2=min(&G+4+offset,&M) or n3=min(&G+4+offset,&M) or n4=min(&G+4+offset,&M) then do;

           point_sum=sum(poi{i1},poi{i2},poi{j1},poi{j2},poi{j3},poi{j4},poi{m},poi{n1},poi{n2},poi{n3},poi{n4});
           cost=sum(pri{i1},pri{i2},pri{j1},pri{j2},pri{j3},pri{j4},pri{m},pri{n1},pri{n2},pri{n3},pri{n4});

           if  point_sum gt point_max and cost le 75000 then do;
            point_max=point_sum; _cost=cost;
            list=catx('|',pla{i1},pla{i2},pla{j1},pla{j2},pla{j3},pla{j4},pla{m},pla{n1},pla{n2},pla{n3},pla{n4});
           end;
               if not missing(list) then do;
             putlog 'Players : ' list 'Max Points : ' point_max 'Cost : ' _cost; stop;
           end;

         end;

    end;
    end;
    end;
    end;
   end;
  end;
  end;
  end;
  end;
end;
end;
end;
putlog 'Not found solution.'; 
end;
run;








Players : 6|12|67|60|2|13|5|70|32|63|9 Max Points : 64 Cost : 75000

Xia Keshan

Message was edited by: xia keshan

Message was edited by: xia keshan

Matthew_Galati
SAS Employee

Good job.

Unfortunately, I think your approach might start to struggle if you change any of the rules associated with the model (for example, try to apply that to a different sport for Fantasy, or change a rule), or if you scale the problem up to more realistic sizes (since you are using "brute force").

As an example, let's consider a larger field of players (more realistic for Fantasy Sports applications). We can simulate that as such:

%macro createData(n=15);

data have(drop=i rename=(count=player));

retain count 0;

    set have;

do i = 1 to &n;

  count+1;

  playerNumber = player;

  output;

end;

run;

%mend createData;

%createData();

Running your code took 49 minutes.

Players : 76|77|511|512|513|514|61|541|542|301|302 Max Points : 66 Cost : 71000

NOTE: DATA statement used (Total process time):

      real time           49:02.72

      cpu time            49:02.69

And running my code took: 0.08s

NOTE: Optimal.

NOTE: Objective = 68.

NOTE: PROCEDURE OPTMODEL used (Total process time):

      real time           0.08 seconds

      cpu time            0.07 seconds

Also,  are you sure your code is exact? The OR solver found a better solution (68).

Ksharp
Super User

169  %macro createData(n=15);

170  data have(drop=i rename=(count=player));

171  retain count 0;

172      set have;

173  do i = 1 to &n;

174    count+1;

175    playerNumber = player;

176    output;

177  end;

178  run;

179  %mend createData;

180  %createData()

WARNING: Variable Player already exists on file WORK.HAVE.

Why I would got this message.

As I said . My algorithm is Backward , so if the solution is at high level , my code would be very fast, whereas if it is at low level ,mine would cost lots of time. But that is the start point problem.

"brute force" , I don't agree with that,  I am partially brute force . first four member then five member .....six ....   .If the solution is at the top ,that would be very fast.

Of courese. I have a long way to walk.


Yeah. You are right. I need to reconsider it for a while .

Message was edited by: xia keshan

Ksharp
Super User

Haha Matt,

I ended up with Genetic Algorithm . It is so sharp and powerful and almost be able to solve all kind of optimum problem.

Which means You can solve all the optimum problems without any money if you have SAS University Edition.Code is written in IML.

It is so amazing and so fast (0.4 sec). I love it too much.

data have;

input Player Position $  Points Price ;

cards;

1 G 2 8000

2 D 4 7000

3 M 6 12000

4 A 8 10000

5 G 4 9000

6 A 9 14000

8 D 4 8000

9 M 6 4000

10 A 8 10000

12 A 9 14000

13 D 4 7000

14 M 6 12000

17 A 9 14000

18 G 2 8000

19 D 4 7000

20 M 6 12000

24 D 4 7000

25 M 6 12000

29 D 4 7000

30 M 6 7000

32 M 6 1000

36 D 4 7000

37 M 6 12000

42 D 4 7000

43 M 6 12000

45 G 4 9000

46 A 9 14000

47 D 4 7000

48 M 6 12000

55 D 4 7000

56 M 6 12000

60 D 4 6000

61 M 6 9000

63 M 6 2000

67 D 4 5000

68 M 6 9000

70 M 8 6000

;

run;

%macro createData(n=15);

data have(drop=i rename=(count=player));

    set have(drop=player);

do i = 1 to &n;

  count+1;

  output;

end;

run;

%mend createData;

%createData()

options fullstimer;

proc sql noprint;

select count(*) into : A from have where Position='A';

select count(*) into : D from have where Position='D';

select count(*) into : G from have where Position='G';

select count(*) into : M from have where Position='M';

quit;

proc iml;

use have;

read all var _all_ where(Position='A');

Player_A=Player;Point_A=Points;Pice_A=Price;

read all var _all_ where(Position='D');

Player_D=Player;Point_D=Points;Pice_D=Price;

read all var _all_ where(Position='G');

Player_G=Player;Point_G=Points;Pice_G=Price;

read all var _all_ where(Position='M');

Player_M=Player;Point_M=Points;Pice_M=Price;

close have;

start football(x) global(Point_A,Pice_A,Point_D,Pice_D,Point_G,Pice_G,Point_M,Pice_M);

points=sum(Point_A[x[1:2]],Point_D[x[3:6]],Point_G[x[7]],Point_M[x[8:11]]);

sum_cost=sum(Pice_A[x[1:2]],Pice_D[x[3:6]],Pice_G[x[7]],Pice_M[x[8:11]]);

if sum_cost>75000 then points=1;

return (points);

finish;

start uniform_cross(child1, child2, parent1, parent2);

child1 = parent1;

child2 = parent2;

do i = 1 to ncol(parent1);

  r = uniform(1234);

  if r<=0.5 then do;

   child1 = parent2;

   child2 = parent1;

  end;

end;

finish;

id=gasetup(2,11,123);

call gasetobj(id,1,"football");

call gasetcro(id,1.0,0,"uniform_cross");

call gasetmut(id,0.2,1);

call gasetsel(id,3,1,0.95);

call gainit(id,2000,{1  1  1  1  1  1  1  1  1  1  1 ,

                    &A &A &D &D &D &D &G &M &M &M &M });

niter = 100;

summary = j(niter,2);

mattrib summary [c = {"Max Points", "Avg Points"} l=""];

do i = 1 to niter;

call garegen(id);

call gagetval(value, id);

summary[i,1] = value[1];

summary[i,2] = value[:];

end;

call gagetmem(mem, value, id, 1);

Players=Player_A[mem[1:2]]//Player_D[mem[3:6]]//Player_G[mem[7]]//Player_M[mem[8:11]];

sum_cost=sum(Pice_A[mem[1:2]],Pice_D[mem[3:6]],Pice_G[mem[7]],Pice_M[mem[8:11]]);

print Players[ l="best member:"],

      "Max Points: " value[l = ""],

      "Total Cost:  " sum_cost[l = ""] ;

iteration = t(1:niter);

print iteration summary;

call gaend(id);

quit;

Xia Keshan

Matthew_Galati
SAS Employee

Yes. GA is a way to approximately solve some optimization problems. It can sometimes be used as a heuristic in conjunction with an optimziation solver.

Are you sure your code is correct?

If I run with createData(n=2), then I get as a result:

best member:

7

8

3

70

70

69

9

74

73

73

73

Which uses 70 twice and 73 three times, which is invalid. Also, the final result is wrong, it says max points = 68, but the actual max points for n=2 is 66.

Ksharp
Super User

Ha. I know that , there is a little glitch . This ought to be right . GA is so powerful , You don't need OR any more . Just kidding . Smiley Happy

data have;

input Player Position $  Points Price ;

cards;

1 G 2 8000

2 D 4 7000

3 M 6 12000

4 A 8 10000

5 G 4 9000

6 A 9 14000

8 D 4 8000

9 M 6 4000

10 A 8 10000

12 A 9 14000

13 D 4 7000

14 M 6 12000

17 A 9 14000

18 G 2 8000

19 D 4 7000

20 M 6 12000

24 D 4 7000

25 M 6 12000

29 D 4 7000

30 M 6 7000

32 M 6 1000

36 D 4 7000

37 M 6 12000

42 D 4 7000

43 M 6 12000

45 G 4 9000

46 A 9 14000

47 D 4 7000

48 M 6 12000

55 D 4 7000

56 M 6 12000

60 D 4 6000

61 M 6 9000

63 M 6 2000

67 D 4 5000

68 M 6 9000

70 M 8 6000

;

run;

%macro createData(n=2);

data have(drop=i rename=(count=player));

    set have(drop=player);

do i = 1 to &n;

  count+1;

  output;

end;

run;

%mend createData;

%createData()

options fullstimer;

proc sort data=have;by Position Points ;run;

proc sql noprint;

select count(*) into : A from have where Position='A';

select count(*) into : D from have where Position='D';

select count(*) into : G from have where Position='G';

select count(*) into : M from have where Position='M';

quit;

proc iml;

use have;

read all var _all_ where(Position='A');

Player_A=Player;Point_A=Points;Pice_A=Price;

read all var _all_ where(Position='D');

Player_D=Player;Point_D=Points;Pice_D=Price;

read all var _all_ where(Position='G');

Player_G=Player;Point_G=Points;Pice_G=Price;

read all var _all_ where(Position='M');

Player_M=Player;Point_M=Points;Pice_M=Price;

close have;

start football(x) global(Point_A,Pice_A,Point_D,Pice_D,Point_G,Pice_G,Point_M,Pice_M);

points=sum(Point_A[x[1:2]],Point_D[x[3:6]],Point_G[x[7]],Point_M[x[8:11]]);

sum_cost=sum(Pice_A[x[1:2]],Pice_D[x[3:6]],Pice_G[x[7]],Pice_M[x[8:11]]);

if sum_cost>75000 | ncol(unique(x[8:11]))^=4  | ncol(unique(x[3:6]))^=4 | ncol(unique(x[1:2]))^=2  then points=1;

return (points);

finish;

id=gasetup(2,11,123);

call gasetobj(id,1,"football");

call gasetcro(id,1.0,2);

call gasetmut(id,0.2,2,1);

call gasetsel(id,100,1,0.95);

call gainit(id,10000,{1  1  1  1  1  1  1  1  1  1  1 ,

                    &A &A &D &D &D &D &G &M &M &M &M });

niter = 100;

summary = j(niter,2);

mattrib summary [c = {"Max Points", "Avg Points"} l=""];

do i = 1 to niter;

call garegen(id);

call gagetval(value, id);

summary[i,1] = value[1];

summary[i,2] = value[:];

end;

call gagetmem(mem, value, id, 1);

Players=Player_A[mem[1:2]]//Player_D[mem[3:6]]//Player_G[mem[7]]//Player_M[mem[8:11]];

sum_cost=sum(Pice_A[mem[1:2]],Pice_D[mem[3:6]],Pice_G[mem[7]],Pice_M[mem[8:11]]);

print Players[ l="best member:"],

      "Max Points: " value[l = ""],

      "Total Cost:  " sum_cost[l = ""] ;

iteration = t(1:niter);

print iteration summary;

call gaend(id);

quit;

Xia Keshan

Message was edited by: xia keshan

Message was edited by: xia keshan

Message was edited by: xia keshan

Ksharp
Super User

Matt,

The funny thing is when the problem become some more complicated (add more obs and have more different Points) , I found OR is a hell of a lot slower than GA .

Xia Keshan

Matthew_Galati
SAS Employee

Yes - that is expected, that a GA approach could be faster than an optimization solver, since GA is only a heuristic, provides no bound on the quality of the solution, and can provide solutions arbitrarily far from optimal.

That is, how do you know if your GA solution is good? Is it Optimal? Is it close to Optimal?

Ksharp
Super User

I know. But it is depend on the complex of problem, if number of obs is not too big , I believe I can get that optimal solution .

And the most important thing is GA can almost solve all sort of optimum problem .  That is the point .

Xia Keshan

Matthew_Galati
SAS Employee

I think you are missing the point. How do you know you are getting the optimal? How do you know you are even getting close to optimal?

You also need to be careful about the definition of "solve".

Also, I think your statement that "GA can almost solve all sort of optimum problem" is quite naive.

Ksharp
Super User

"GA can almost solve all sort of optimum problem" is not from me . is from documentation of  GA , I just quote it here .

Matthew_Galati
SAS Employee

Can you please point me to the documentation that has this statement? Thank you.

Ksharp
Super User

Hi Matt,

I might misunderstood the meaning of documentation . Said what you will , I still think it is a incredible tool .

Genetic algorithms (referred to hereafter as GAs) are a family of search algorithms that seek optimal solutions

to problems using the principles of natural selection and evolution. GAs can be applied to almost any

optimization problem and are especially useful for problems where other calculus-based techniques do not

work, such as when the objective function has many local optimums, it is not differentiable or continuous, or

solution elements are constrained to be integers or sequences. In most cases GAs require more computation

than specialized techniques that take advantage of specific problem structure or characteristics. However, for

optimization problems with no such techniques available, GAs provide a robust general method of solution.

Matthew_Galati
SAS Employee

Yes. You misunderstood the meaning. You have to be careful about what you mean when saying you "solve an optimization problem". GA is a great tool and very useful in certain cases to approximately solve some optimization problems. But, it is no replacement for an optimization solver, if you care about the quality of the solution.

In some cases, we have used GA and optimziation together. The GA is used to find an initial feasible solution, which is then used as a starting point (primalin) for the integer programming solver.

sas-innovate-2024.png

 

Secure your spot at the must-attend AI and analytics event of 2024: SAS Innovate 2024! Get ready for a jam-packed agenda featuring workshops, super demos, breakout sessions, roundtables, inspiring keynotes and incredible networking events.

 

Register by March 1 to snag the Early Bird rate of just $695! Don't miss out on this exclusive offer. 

 

Register now!

Multiple Linear Regression in SAS

Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 32 replies
  • 4940 views
  • 3 likes
  • 8 in conversation