Operations Research topics: SAS/OR,
SAS Optimization, and SAS Simulation Studio

An interesting OR problem

Accepted Solution Solved
Reply
Super User
Posts: 10,214
Accepted Solution

An interesting OR problem

Here is an very interesting operation research problem .

http://stackoverflow.com/questions/35863766/find-the-minimal-list-from-a-many-to-many-mapping

 

I want know how to use SAS/OR to solve this.

I make it a little bit more complicated.

 

all = {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20} ;

A 1,3,5,7,9,19,12
B 2,4,6,8,10,16
C 3,6,8,12,17,1,10
D 5,13,20,15,16
E 4,9,15,19,5,14
F 11,2,4,15,1,14,10
G 20,13,15,4,5,6,9
H 19,18,17,11,15,11
I 2,14,6,16,3,11,7
J 1,12,3,13,4,14,18
K 7,17,8,18,9,19,20,4

 

Here is my GA solution.Don't cheat . 

 

 

 

data x;
input name $ v  $40.;
cards;
A 1,3,5,7,9,19,12
B 2,4,6,8,10,16
C 3,6,8,12,17,1,10
D 5,13,20,15,16
E 4,9,15,19,5,14
F 11,2,4,15,1,14,10
G 20,13,15,4,5,6,9
H 19,18,17,11,15,11
I 2,14,6,16,3,11,7
J 1,12,3,13,4,14,18
K 7,17,8,18,9,19,20,4
;
run;
data have;
 set x;
 do i=1 to countw(v,',');
  value=input(scan(v,i,','),best32.);
  output;
 end;
 drop i v;
run;
data key;
 set x(keep=name rename=(name=l));
run;

proc iml;

all = {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20} ;


use have nobs nobs;
read all var {name value};
close have;
use key;
read all var {l};
close key;

group_n=t(loc(name^=t(remove(name,1)||{' '})))-
        t(loc(name^=t({' '}||remove(name,nobs)))) +1;
all_n=countunique(all);
n=countunique(name);
encoding=repeat({0,1},1,n);

start func(x) global(value,group_n,all_n); 
 temp=setdif(t(repeat(x,group_n))#value,{0});
 if countunique(temp)=all_n then sum=sum(x);
  else sum=99999;
 return (sum);
finish;

id=gasetup(2,n,1234);
call gasetobj(id,0,"func");
call gasetcro(id,0.95,2);
call gasetmut(id,0.95,3);
call gasetsel(id,100,1,1);
call gainit(id,1000,encoding); 
niter = 100000;
summary = j(niter,2);
mattrib summary [c = {"MinCount", "Avg Count"} l=""];
do i = 1 to niter;
 call garegen(id);
 call gagetval(v, id);
 summary[i,1] = v[1];
 summary[i,2] = v[:];
end;
call gagetmem(mem, v, id, 1);

Memebers=l[loc(mem)];
print "Best Members:" Memebers[l=""],
      "Min Count:   " v[l = ""] ;
call gaend(id);
quit;

 

 

OUTPUT:


Best Members: C G I K
Min Count: 4


Accepted Solutions
Solution
‎03-12-2016 08:21 PM
SAS Employee
Posts: 505

Re: An interesting OR problem

proc optmodel;
   set <str,num> NAMES_VALUES;
   read data have into NAMES_VALUES=[name value];
   set NAMES  = setof {<i,v> in NAMES_VALUES} i;
   set VALUES = setof {<i,v> in NAMES_VALUES} v;

   var Select {NAMES} binary;
   min NumSelected = sum {i in NAMES} Select[i];
   con Cover {v in VALUES}:
      sum {<i,(v)> in NAMES_VALUES} Select[i] >= 1;

   solve;
   print {i in NAMES: Select[i].sol > 0.5} Select;
quit;

[1] Select
C 1
D 1
F 1
K 1

 

This is known as a "set covering problem" in the literature.

View solution in original post


All Replies
Solution
‎03-12-2016 08:21 PM
SAS Employee
Posts: 505

Re: An interesting OR problem

proc optmodel;
   set <str,num> NAMES_VALUES;
   read data have into NAMES_VALUES=[name value];
   set NAMES  = setof {<i,v> in NAMES_VALUES} i;
   set VALUES = setof {<i,v> in NAMES_VALUES} v;

   var Select {NAMES} binary;
   min NumSelected = sum {i in NAMES} Select[i];
   con Cover {v in VALUES}:
      sum {<i,(v)> in NAMES_VALUES} Select[i] >= 1;

   solve;
   print {i in NAMES: Select[i].sol > 0.5} Select;
quit;

[1] Select
C 1
D 1
F 1
K 1

 

This is known as a "set covering problem" in the literature.

Respected Advisor
Posts: 5,055

Re: An interesting OR problem

Did you modify @Ksharp's input (to correct the anomaly in set H) ? I get C-G-H-I with SAS/STAT 13.1 .

PG
SAS Employee
Posts: 505

Re: An interesting OR problem

[ Edited ]

No, I just used the "have" data set as is.  The READ DATA statement automatically removed the duplicate:

 WARNING: Duplicate key <H,11> was read at observation 51.

 

By the way, you can use the CLP solver to find all optimal solutions:

 

   solve with CLP / findallsolns;
   for {s in 1.._NSOL_} put ({i in NAMES: Select[i].sol[s] > 0.5});

It turns out that there are four in this case:

{'C','G','H','I'}
{'C','D','I','K'}
{'C','G','I','K'}
{'C','D','F','K'}

 

How did you solve it with SAS/STAT?

Respected Advisor
Posts: 5,055

Re: An interesting OR problem

I used SAS/OR 9.1, sorry. Yet, it didn't pick the same answer as yours. Is there something non-deterninistic in the algorithm?

PG
Valued Guide
Posts: 505

Re: An interesting OR problem

Hi Team

 

 I will be posting more detail on the SAS-L list, But it has 15 lines in a single data step(after removing checking arrays)? 

 

There appear to be 4 solutions, kind of a lark so be suspecious. I learn more from my errors then my solutions.

 

see below

 

Here is one of them

 

3=c
4=d
6=f
11=k

 

Others are below.

 

The solution may not scale  over 54 codes without using extended precision, no limit of number of vectors.

 

The only change from a previous post on SAS-L was to substitute 'bor' for 'sum'


Obs TOT E1 E2 E3 E4 D1 D2 D3 D4

1 2097150 136522 1155104 52246 1966992             3 4 6 11
2 2097150 136522 1155104 84172 1966992             3 4 9 11
3 2097150 136522 1090160 952320 84172                 3 7 8 9
4 2097150 136522 1090160 84172 1966992              3 7 9 11

 

data x;
cost= 2**1+2**3+2**5+2**7+2**9+2**19+2**12 ; output;
cost= 2**2+2**4+2**6+2**8+2**10+2**16 ; output;
cost= 2**3+2**6+2**8+2**12+2**17+2**1+2**10 ; output;
cost= 2**5+2**13+2**20+2**15+2**16 ; output;
cost= 2**4+2**9+2**15+2**19+2**5+2**14 ; output;
cost= 2**11+2**2+2**4+2**15+2**1+2**14+2**10 ; output;
cost= 2**20+2**13+2**15+2**4+2**5+2**6+2**9 ; output;
cost= 2**19+2**18+2**17+2**11+2**15 ; output;
cost= 2**2+2**14+2**6+2**16+2**3+2**11+2**7 ; output;
cost= 2**1+2**12+2**3+2**13+2**4+2**14+2**18 ; output;
cost= 2**7+2**17+2**8+2**18+2**9+2**19+2**20+2**4; output;
run;

all = {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20} ;

%macro ngetk(n=11);
/* generate combinations of n items taken k at a time */
data Comb(keep=tot res d: e: where=(tot=2097150));
retain tot 0 res e1-e11 d1-d11;
array chksum[11] _temporary_ (529066,66900,136522,1155104,574000,52246,1090160,952320,84172,290842,1966992);
array d[11] d1-d11 (11*0);
array e[11] e1-e11 (11*0);
%do k=1 %to 11;
array c&k.f[&k] (&k.*0); /* initialize array to 0 */
ncomb = comb(&n, &k); /* number of combinations */
do j = 1 to ncomb;
rc = lexcombi(&n, &k, of c&k.f[*]);
do l=1 to &k;
tot=bor(tot, chksum[c&k.f[l]]);
d[l]=c&k.f[l];
e[l]=chksum[c&k.f[l]];
end;
res=put(tot,hex8.);
output;
tot=0;
end;
%end;
run;
%mend ngetk;

%ngetk;

 

 

Super User
Posts: 10,214

Re: An interesting OR problem

Posted in reply to rogerjdeangelis

Roger,

I noticed you used lexcombi(). That means you can't handle large scale problem(i.e. have lots and lots of student) .

SAS Employee
Posts: 505

Re: An interesting OR problem

The algorithm is deterministic in the sense that if you run it twice with the same settings on the same machine, you will get the same solution.  But, in each new release of SAS/OR, changes in the underlying simplex algorithm, presolver, heuristics, cutting planes, branching strategies, etc. can alter which solution gets returned.

Respected Advisor
Posts: 5,055

Re: An interesting OR problem

I guess this means that the solution returned might depend on the number of threads involved in the search.

PG
SAS Employee
Posts: 505

Re: An interesting OR problem

Yes, the solution can depend on the number of threads used.

Super User
Posts: 10,214

Re: An interesting OR problem

Impressed. I want know how many students OR can handle ?

 

 

PG,

Yes. H is a typo . Sorry.

SAS Employee
Posts: 505

Re: An interesting OR problem

The number of students it can handle depends on the number of courses, how much memory you have, how long you're willing to wait, and how close to optimal you want to get.  Set covering is NP-hard, but in practice the MILP approach should definitely scale much better than brute-force.

Contributor
Posts: 22

Re: An interesting OR problem

Fantastic kskarp. Thanks a lot.


You make me dive deeply into the GA subject as I am having a hard time to understand your code.
But finally I've come behind it and I would like to share my insights for others who might be struggling to understand this code.

 

As the link to the problem description is broken, I start with my auto-explanation of what is the task here.
Find a minimum set of persons (variable:names) so that their attributes (here numbers from 1 to 20, vector:value) represent the range of the population (vector:all).

 

your code in black

group_n=t(loc(name^=t(remove(name,1)||{' '})))-
t(loc(name^=t({' '}||remove(name,nobs)))) +1;  /*Create a count of values for each person. The "data have" step earlier ensures that it gets sorted by person. It cost me a hell to figure out what it does. It's a nice trick but for me it's easier to understand to get this via this loop:

u = unique(name); /* 2. Unique values (levels) of categorical variable. */
s = j(ncol(u),1); /* 3. Allocate vector to hold results */
do i = 1 to ncol(u); /* 4. For each level... */
idx = (name=u[i]); /* 5. indicator matrix */
s[i] = t(idx)*idx;/* 6. count */
end;

 

encoding=repeat({0,1},1,n); /*The encoding has ones and zeros and its size is equal to number of unique persons. One selects that person meanwhile 0 leaves it out from the selection.

 

For the sake of understanding I created an random selection vector and played around:

call randseed(123);         /* set random number seed */

u = j(1,nrow(group_n),1);                /* allocate */

call randgen(u, "BERNOULLI",0.1); 

 

repeat(u,group_n);  /* Create ones and zeros as of the size of the vector of attributes. First element A=1 second element A=3...
This puts zeros (or ones) to all person's elements.

 

t(repeat(x,group_n))#value /* The 1/0 vector gets multiplied elementwise with the attributes values vector prior transpose.

 

temp=setdif(t(repeat(x,group_n))#value,{0}); /*The setdif function cancels out zeros and leaves only unique entries. If 10 as value appears several times, then the setdif resulting vector has only 1 times 10.

 

if countunique(temp)=all_n then sum=sum(x); /*If and only if all values are present in the current selection then count the sum of persons chosen.
else sum=99999;  /*As it's a minimization problem, else set sum to a very high value to penalize.

Super User
Posts: 10,214

Re: An interesting OR problem

Congratulations!

Glad you could understand my GA code.

Enjoy Genetic Algorithm, you can concur any of optimal problem via it.

I know @RobPratt might don't think so . But I love GA .

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 13 replies
  • 1184 views
  • 9 likes
  • 5 in conversation