cxkev182
Fluorite | Level 6

Good evening,

I have a dataset that contains the predicted points for each player in a fantasy football competition. The fields in the dataset are:

Player

Team

Position (G, D, M or A)

Points

Price

Is it possible to use a procedure that would enable me to pick the combination of players that would score maximum points while satisfying following conditions:

Number of G selected = 1

Number of D selected = 4

Number of M selected = 4

Number of A selected = 2

Total price of players selected <= £75,000,000

Many thanks for any help you can give,

cxkev

1 ACCEPTED SOLUTION

Accepted Solutions
gergely_batho
SAS Employee

Hi cxkev,

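I have a blog post that is about exactly this problem. Unfortunately it is in Hungarian, but the formulas are in English.

Kell egy csapat! – a feladat megoldása - Analitika anyanyelven (in English: "We need a team! Solving the problem", on the blog "Analytics in your mother tongue")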

But you can download sample data and code (OPTMODEL) from here:

http://blogs.sas.com/content/analitika/files/2012/09/generate_data_sas.txt

http://blogs.sas.com/content/analitika/files/2012/09/solve_team_sas.txt


32 REPLIES
Anotherdream
Quartz | Level 8

I'm sure there is a mathematical formula for this (which I am not aware of, so take this with a grain of salt), but you could break it into a logic problem and avoid any 'formulas' altogether.

By definition, if you had no price cap, you would simply pick the 1 G, 4 D, 4 M, and 2 A that scored the most points. However, you have a price cap.

So what I would do is create a new variable that is the player's Points / Cost. Then I would select the best players in every group (defined by points scored, regardless of price), then figure out how much you are OVER your spending budget.

Then rank your players by your "points / cost" variable. Replace the player with the worst points / cost in your selection with the next available player, and see if you are then under budget.

Continue this process in a loop, and once you are under budget you are "very close" to maximized in points scored. There's a little more work from here, however.

Let's say you are only 1 million over budget, and you replace your 3rd best D with the 5th best D, but their salaries are very far apart. Maybe now you are 8 million under budget... Well, you might be able to get a better "G" player, so you'd have to loop over all available players to see if the difference in salaries is small enough for you to add them back.

Again, once this logic check fails, you are at (or, since this is a heuristic, at least very near) the maximum points scored.
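
The swap procedure described in this post can be sketched roughly as follows. This is a Python illustration rather than SAS, and only one possible reading of the steps: the function name `greedy_team`, the tuple layout, the toy data and the tie-breaking choices are all assumptions made here. It is a heuristic, so it is not guaranteed to find the best team in general.

```python
def greedy_team(players, need, budget):
    """Swap heuristic: start from the top scorers, cut cost, then re-upgrade.

    players is a list of (id, position, points, price) tuples.
    Returns the chosen players, or None if no cost-cutting swap is left.
    """
    def value(p):
        return p[2] / p[3]  # points per unit of price

    def cost(team):
        return sum(p[3] for p in team)

    # Step 1: ignore the budget and take the top scorers in each position.
    team = []
    for pos, k in need.items():
        pool = sorted((p for p in players if p[1] == pos), key=lambda p: -p[2])
        team.extend(pool[:k])

    # Step 2: while over budget, replace the worst-value player that has a
    # cheaper same-position alternative with the best-value such alternative.
    while cost(team) > budget:
        for out in sorted(team, key=value):
            cheaper = [p for p in players
                       if p[1] == out[1] and p not in team and p[3] < out[3]]
            if cheaper:
                team[team.index(out)] = max(cheaper, key=value)
                break
        else:
            return None  # still over budget and nothing cheaper to swap in

    # Step 3: spend any leftover budget on upgrades that still fit.
    upgraded = True
    while upgraded:
        upgraded = False
        spare = budget - cost(team)
        for i, out in enumerate(team):
            better = [p for p in players
                      if p[1] == out[1] and p not in team
                      and p[2] > out[2] and p[3] - out[3] <= spare]
            if better:
                team[i] = max(better, key=lambda p: p[2])
                upgraded = True
                break
    return team


# Hypothetical toy data: (player id, position, points, price).
players = [
    (1, "G", 2, 8), (2, "G", 4, 9),
    (3, "D", 4, 7), (4, "D", 5, 9), (5, "D", 3, 5),
    (6, "A", 9, 14), (7, "A", 8, 10),
]
need = {"G": 1, "D": 2, "A": 1}
team = greedy_team(players, need, budget=32)
print(sorted(p[0] for p in team), sum(p[2] for p in team))  # [2, 3, 5, 7] 19
```

On this toy data the result happens to match the best feasible team (19 points at a cost of 31), but only because the upgrade pass in step 3 recovers the goalkeeper that step 2 gave up. With other data the swaps can stall at a worse team.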

RobPratt
SAS Super FREQ

You can use PROC OPTMODEL to formulate the problem and solve it with the mixed integer linear programming (MILP) solver. You need to introduce one binary variable per player, with the interpretation that the variable equals 1 if and only if that player is selected. Each of the five conditions (the four position counts and the price cap) corresponds to a linear constraint over these variables. And the objective function to be maximized is also a linear function of these variables.
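
Written out, the formulation described in this post could look like the following. The notation is introduced here for illustration and is not from the post: $x_i$ is the binary variable for player $i$, $p_i$ and $c_i$ are that player's predicted points and price, and $P_k$ is the set of players in position $k$.

```latex
\begin{aligned}
\max\quad        & \sum_{i} p_i \, x_i \\
\text{s.t.}\quad & \sum_{i \in P_k} x_i = n_k, \qquad k \in \{G, D, M, A\},
                   \quad (n_G, n_D, n_M, n_A) = (1, 4, 4, 2) \\
                 & \sum_{i} c_i \, x_i \le 75{,}000{,}000 \\
                 & x_i \in \{0, 1\} \quad \text{for every player } i
\end{aligned}
```

The first constraint family covers the four position counts and the second is the price cap, so there are five linear constraints in total.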

Ksharp
Super User

You don't even need SAS/OR; a simple data step is enough.

Xia Keshan

Matthew_Galati
SAS Employee

How exactly would you solve this with a simple data step?

Ksharp
Super User

Sure. Here the condition is set to: total price of players selected <= £75,000.

data have;
input Player Position $  Points Price     ;
cards;
1 G 2 8000
2 D 4 7000
3 M 6 12000
4 A 8 10000
5 G 4 9000
6 A 9 14000
8 D 4 8000
9 M 6 4000
10 A 8 10000
12 A 9 14000
13 D 4 7000
14 M 6 12000
17 A 9 14000
18 G 2 8000
19 D 4 7000
20 M 6 12000
24 D 4 7000
25 M 6 12000
29 D 4 7000
30 M 6 7000
32 M 6 1000
36 D 4 7000
37 M 6 12000
42 D 4 7000
43 M 6 12000
45 G 4 9000
46 A 9 14000
47 D 4 7000
48 M 6 12000
55 D 4 7000
56 M 6 12000
60 D 4 6000
61 M 6 9000
63 M 6 2000
67 D 4 5000
68 M 6 9000
70 M 8 6000
;
run;

/* Count the rows in HAVE so the temporary arrays can be sized */
%let dsid=%sysfunc(open(have));
%let nobs=%sysfunc(attrn(&dsid,nlobs));
%let dsid=%sysfunc(close(&dsid));

/* Sort by position (A, D, G, M) and store the last row number of each
   position in the macro variables &A, &D, &G and &M */
proc sort data=have; by position; run;
data _null_;
 set have;
 by position;
 if last.position then call symputx(position,_n_);
run;

data _null_;
 set have end=last;
 length list $ 100 point_sum point_max cost _cost 8;
 array pla{&nobs} _temporary_;
 array pos{&nobs} $ _temporary_;
 array poi{&nobs} _temporary_;
 array pri{&nobs} _temporary_;
 /* Load each player into the arrays */
 pla{_n_}=Player;
 pos{_n_}=Position;
 poi{_n_}=Points;
 pri{_n_}=Price;

 /* On the last row, enumerate every team of 2 A, 4 D, 1 G and 4 M */
 if last then do;
  do i1=1 to &A;
  do i2=i1+1 to &A;
   do j1=%eval(&A+1) to &D;
   do j2=j1+1 to &D;
   do j3=j2+1 to &D;
   do j4=j3+1 to &D;
    do m=%eval(&D+1) to &G;
     do n1=%eval(&G+1) to &M;
     do n2=n1+1 to &M;
     do n3=n2+1 to &M;
     do n4=n3+1 to &M;
      point_sum=sum(poi{i1},poi{i2},poi{j1},poi{j2},poi{j3},poi{j4},poi{m},poi{n1},poi{n2},poi{n3},poi{n4});
      cost=sum(pri{i1},pri{i2},pri{j1},pri{j2},pri{j3},pri{j4},pri{m},pri{n1},pri{n2},pri{n3},pri{n4});
      /* Keep the best team found so far that fits the budget */
      if point_sum gt point_max and cost le 75000 then do;
       point_max=point_sum; _cost=cost;
       list=catx('|',pla{i1},pla{i2},pla{j1},pla{j2},pla{j3},pla{j4},pla{m},pla{n1},pla{n2},pla{n3},pla{n4});
      end;
     end;
     end;
     end;
     end;
    end;
   end;
   end;
   end;
   end;
  end;
  end;
  putlog 'Players : ' list 'Max Points : ' point_max 'Cost : ' _cost;
 end;
run;

Players : 6|12|2|13|60|67|5|9|32|63|70 Max Points : 64 Cost : 75000
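
The same exhaustive search can be written without one hand-coded loop per roster slot. The sketch below is Python rather than SAS and is only meant to show the idea; the function name `best_team`, the tuple layout and the toy data are hypothetical and are not taken from the data step above.

```python
from itertools import combinations, product


def best_team(players, need, budget):
    """Try every valid team; return (points, cost, ids) of the best one, or None."""
    # One list of candidate groups per position, e.g. every 2-defender group.
    groups = [
        list(combinations([p for p in players if p[1] == pos], k))
        for pos, k in need.items()
    ]
    best = None
    for pick in product(*groups):
        team = [p for group in pick for p in group]
        cost = sum(p[3] for p in team)
        points = sum(p[2] for p in team)
        # Keep the highest-scoring team that fits the budget.
        if cost <= budget and (best is None or points > best[0]):
            best = (points, cost, sorted(p[0] for p in team))
    return best


# Hypothetical toy data: (player id, position, points, price).
players = [
    (1, "G", 2, 8), (2, "G", 4, 9),
    (3, "D", 4, 7), (4, "D", 5, 9), (5, "D", 3, 5),
    (6, "A", 9, 14), (7, "A", 8, 10),
]
need = {"G": 1, "D": 2, "A": 1}
print(best_team(players, need, budget=32))  # (19, 31, [2, 3, 5, 7])
```

Because the roster sizes live in the `need` dictionary, changing the number of players per position does not require rewriting the loops. The running time still grows with the number of possible teams.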

Xia Keshan


Haikuo
Onyx | Level 15

This can hardly be called a 'simple data step', maybe only for you though ;). Nonetheless, you got the job done nicely. However, you probably need an additional step to obtain all of the qualifying combinations, or you could go the extra mile within a single data step by setting up a hash or multi-dimensional array to retain all of them. In your mock data there are 432 combinations meeting the same criteria.

Here is my dumber approach,

/* Player IDs and player counts per position, stored in macro variables */
proc sql noprint;
select player into :gp separated by ' ' from have where Position='G';
select count(player) into :gc separated by ' ' from have where Position='G';
select player into :dp separated by ' ' from have where Position='D';
select count(player) into :dc separated by ' ' from have where Position='D';
select player into :mp separated by ' ' from have where Position='M';
select count(player) into :mc separated by ' ' from have where Position='M';
select player into :ap separated by ' ' from have where Position='A';
select count(player) into :ac separated by ' ' from have where Position='A';
quit;

/* All ways to pick 1 G */
data _gp;
array gp[&gc.] (&gp.);
n=dim(gp);
k=1;
ncomb=comb(n, k);
do j=1 to ncomb;
  call allcomb(j, k, of gp[*]);
  output;
end;
keep gp1;
run;

/* All ways to pick 4 D */
data _dp;
array dp[&dc.] (&dp.);
n=dim(dp);
k=4;
ncomb=comb(n, k);
do j=1 to ncomb;
  call allcomb(j, k, of dp[*]);
  output;
end;
keep dp1-dp4;
run;

/* All ways to pick 4 M */
data _mp;
array mp[&mc.] (&mp.);
n=dim(mp);
k=4;
ncomb=comb(n, k);
do j=1 to ncomb;
  call allcomb(j, k, of mp[*]);
  output;
end;
keep mp1-mp4;
run;

/* All ways to pick 2 A */
data _ap;
array ap[&ac.] (&ap.);
n=dim(ap);
k=2;
ncomb=comb(n, k);
do j=1 to ncomb;
  call allcomb(j, k, of ap[*]);
  output;
end;
keep ap1-ap2;
run;

/* Cartesian product: every possible team */
proc sql;
create table comall as
select * from _gp, _dp, _mp, _ap;
quit;

/* Look up price and points for each team member, keep teams within budget */
data h1;
if _n_=1 then do;
  if 0 then set have;
  declare hash h(dataset:'have');
  h.definekey('player');
  h.definedata(all:'y');
  h.definedone();
end;
set comall;
array p gp1--ap2;
call missing (_price, _points);
do over p;
  rc=h.find(key:p);
  _price+price;
  _points+points;
end;
if _price <=75000;
run;

/* Keep every team that reaches the maximum points */
proc sql;
create table want(keep = gp1 dp1-dp4 mp1-mp4 ap1 ap2) as
select * from h1 having _points=max(_points);
quit;

Haikuo

Matthew_Galati
SAS Employee

Both the data step and SQL approaches use brute-force enumeration and hard-coded sizes for the player positions (number of players needed). What happens if the number of players needed changes? What if you tried to scale this up? How long would the data step take to run if you had a field of 500 players, which is quite common in fantasy sports applications?
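
To put a number on the scaling question, the count of teams a brute-force search must visit is the product of one binomial coefficient per position. The Python sketch below does that arithmetic; the helper name `team_count` is made up here, and the position split of the 500-player field is purely an assumption for illustration.

```python
from math import comb


def team_count(pool, need):
    """Number of distinct teams: product over positions of C(pool size, players needed)."""
    total = 1
    for pos, k in need.items():
        total *= comb(pool[pos], k)
    return total


need = {"G": 1, "D": 4, "M": 4, "A": 2}

# Position counts in the 37-player sample data posted earlier in the thread.
sample_pool = {"G": 4, "D": 12, "M": 15, "A": 6}
print(team_count(sample_pool, need))  # 40540500

# Hypothetical 500-player field; this split by position is an assumption.
big_pool = {"G": 50, "D": 150, "M": 200, "A": 100}
print(f"{team_count(big_pool, need):.1e}")
```

The sample data already yields about 40.5 million candidate teams. With the assumed 500-player split the count is on the order of 10^20, which is far beyond what enumeration can handle.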


For this tiny example, on my machine:

• Xia's code ran in: 7 seconds
• Haikuo's code ran in: ~2 minutes
• My code below (adapted from previous post) ran in: 0.03 seconds

data rules;
input position $ numPlayersPerPos;
datalines;
G 1
D 4
M 4
A 2
;
run;

proc optmodel;
/* Index sets and data, filled by the READ DATA statements below */
set PLAYERS;
num price{PLAYERS};
num points{PLAYERS};
str position{PLAYERS};
set<str> POSITIONS;
num numPlayersPerPos{POSITIONS};

/* assign[p] = 1 if player p is selected */
var assign{PLAYERS} binary;

max sumPoints = sum{p in PLAYERS} points[p]*assign[p];
con Budget: sum{p in PLAYERS} price[p]*assign[p] <= 75000;
con NumPos{po in POSITIONS}: sum{p in PLAYERS: position[p]=po} assign[p] = numPlayersPerPos[po];

read data have into PLAYERS=[Player] position price points;
read data rules into POSITIONS=[position] numPlayersPerPos;

solve;

/* Write out the selected players (binary values may be slightly off 1) */
create data team from [player=p]={p in PLAYERS: assign[p]>0.9};
quit;

Haikuo
Onyx | Level 15

Point taken, and it is no surprise that a specialized SAS proc wins :). But what if you don't have SAS/OR? SAS/OR is not like SAS/STAT, which most customers want to have just to be on the safe side. For those who need this kind of functionality only occasionally, a data step or proc sql still stands as a viable solution.

Matthew_Galati
SAS Employee

Everyone I know wants to have SAS/OR. :)

Haikuo
Onyx | Level 15

True. I want it, but sadly we don't have it, even as a company with more than 400 SAS users. :(

LeoLopes
SAS Employee

(Disclaimer: I write models for SAS/OR customers.)

I'd venture to say that for any sufficiently complex business, the cost of not having SAS/OR is greater than the cost of the license, and that the difference between those two costs only increases as advances in predictive analytics and data management increase analytical maturity.

Beyond computing time, think of analyst time. That is what is most expensive. Compare the complexity of the solutions on this page. If the opportunity to automate a decision process is ignored because implementing a solution using imperative languages is too complicated, or is too difficult to adapt to changing underlying circumstances, then that business process remains manual, errors and inefficiencies in that business process remain undetected, and all the interactions between that process and the other processes in the firm suffer.

In contrast, when it is easy to describe a decision process using declarative constructs that are close to the business rules themselves, and easy to use an optimization engine to automate the computation of the solution, then more processes are automated, and the automation of each process enables gains that start from that very specific operation but then emanate throughout the firm.

Those network effects are benefits beyond the traditional, more immediate benefit of running the process itself in a more efficient manner.

Haikuo
Onyx | Level 15

While I heartily embrace your sentiment, your general comments also apply to SAS/IML, SAS/ETS, SAS/QC, and even SAS/AF. And if you want to go beyond that, they also apply to E-miner, and on top of that Text-miner, and one step further SAS Sentiment Analysis, and we haven't even touched the many other great SAS solution servers. At the end of the day, it is all about cost of ownership vs. potential risks, and where to draw the line is usually hard to calculate and therefore rather subjective. While I am nowhere close to the pay grade that decides which product to buy, I can still see how it goes. When businesses feel confused or undecided, that is when they step on the brakes and, sad to say, that is also when they resort to third-party solutions, such as R.

Ksharp
Super User

"What happens if the number of players needed would change? What if you tried to scale this up?"

I can write a macro that does what SAS/OR did; that is not a big deal.

"What happens if the number of players needed would change? What if you tried to scale this up?"

That depends on which algorithm you are using. I would like to know a better and faster algorithm for this question. It would be generous if you could show exactly which algorithm SAS/OR is using, either as SAS data step code or just explained in English. Maybe I could end up with a better and faster algorithm in the near future.

As Bian said, SAS/OR costs money, and I also believe SAS/OR has the best algorithms in the world for this kind of PROGRAMMING. But not every company wants to pay for it. So a data step or some other technique is an alternative, although it will take a lot more time than SAS/OR. But that is an algorithm problem.

Xia Keshan
