Turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Options

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

🔒 This topic is **solved** and **locked**.
Need further help from the community? Please
sign in and ask a **new** question.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Posted 03-12-2016 12:51 AM
(2630 views)

Here is an very interesting operation research problem .

http://stackoverflow.com/questions/35863766/find-the-minimal-list-from-a-many-to-many-mapping

I want know how to use SAS/OR to solve this.

I make it a little bit more complicated.

all = {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20} ;

A 1,3,5,7,9,19,12

B 2,4,6,8,10,16

C 3,6,8,12,17,1,10

D 5,13,20,15,16

E 4,9,15,19,5,14

F 11,2,4,15,1,14,10

G 20,13,15,4,5,6,9

H 19,18,17,11,15,11

I 2,14,6,16,3,11,7

J 1,12,3,13,4,14,18

K 7,17,8,18,9,19,20,4

Here is my GA solution.Don't cheat .

```
data x;
input name $ v $40.;
cards;
A 1,3,5,7,9,19,12
B 2,4,6,8,10,16
C 3,6,8,12,17,1,10
D 5,13,20,15,16
E 4,9,15,19,5,14
F 11,2,4,15,1,14,10
G 20,13,15,4,5,6,9
H 19,18,17,11,15,11
I 2,14,6,16,3,11,7
J 1,12,3,13,4,14,18
K 7,17,8,18,9,19,20,4
;
run;
data have;
set x;
do i=1 to countw(v,',');
value=input(scan(v,i,','),best32.);
output;
end;
drop i v;
run;
data key;
set x(keep=name rename=(name=l));
run;
proc iml;
all = {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20} ;
use have nobs nobs;
read all var {name value};
close have;
use key;
read all var {l};
close key;
group_n=t(loc(name^=t(remove(name,1)||{' '})))-
t(loc(name^=t({' '}||remove(name,nobs)))) +1;
all_n=countunique(all);
n=countunique(name);
encoding=repeat({0,1},1,n);
start func(x) global(value,group_n,all_n);
temp=setdif(t(repeat(x,group_n))#value,{0});
if countunique(temp)=all_n then sum=sum(x);
else sum=99999;
return (sum);
finish;
id=gasetup(2,n,1234);
call gasetobj(id,0,"func");
call gasetcro(id,0.95,2);
call gasetmut(id,0.95,3);
call gasetsel(id,100,1,1);
call gainit(id,1000,encoding);
niter = 100000;
summary = j(niter,2);
mattrib summary [c = {"MinCount", "Avg Count"} l=""];
do i = 1 to niter;
call garegen(id);
call gagetval(v, id);
summary[i,1] = v[1];
summary[i,2] = v[:];
end;
call gagetmem(mem, v, id, 1);
Memebers=l[loc(mem)];
print "Best Members:" Memebers[l=""],
"Min Count: " v[l = ""] ;
call gaend(id);
quit;
```

OUTPUT:

Best Members: C G I K

Min Count: 4

1 ACCEPTED SOLUTION

Accepted Solutions

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

```
proc optmodel;
set <str,num> NAMES_VALUES;
read data have into NAMES_VALUES=[name value];
set NAMES = setof {<i,v> in NAMES_VALUES} i;
set VALUES = setof {<i,v> in NAMES_VALUES} v;
var Select {NAMES} binary;
min NumSelected = sum {i in NAMES} Select[i];
con Cover {v in VALUES}:
sum {<i,(v)> in NAMES_VALUES} Select[i] >= 1;
solve;
print {i in NAMES: Select[i].sol > 0.5} Select;
quit;
```

[1] Select

C 1

D 1

F 1

K 1

This is known as a "set covering problem" in the literature.

13 REPLIES 13

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

```
proc optmodel;
set <str,num> NAMES_VALUES;
read data have into NAMES_VALUES=[name value];
set NAMES = setof {<i,v> in NAMES_VALUES} i;
set VALUES = setof {<i,v> in NAMES_VALUES} v;
var Select {NAMES} binary;
min NumSelected = sum {i in NAMES} Select[i];
con Cover {v in VALUES}:
sum {<i,(v)> in NAMES_VALUES} Select[i] >= 1;
solve;
print {i in NAMES: Select[i].sol > 0.5} Select;
quit;
```

[1] Select

C 1

D 1

F 1

K 1

This is known as a "set covering problem" in the literature.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Did you modify @Ksharp's input (to correct the anomaly in set H) ? I get C-G-H-I with SAS/STAT 13.1 .

PG

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

No, I just used the "have" data set as is. The READ DATA statement automatically removed the duplicate:

WARNING: Duplicate key <H,11> was read at observation 51.

By the way, you can use the CLP solver to find all optimal solutions:

```
solve with CLP / findallsolns;
for {s in 1.._NSOL_} put ({i in NAMES: Select[i].sol[s] > 0.5});
```

It turns out that there are four in this case:

{'C','G','H','I'}

{'C','D','I','K'}

{'C','G','I','K'}

{'C','D','F','K'}

How did you solve it with SAS/STAT?

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

I used SAS/OR 9.1, sorry. Yet, it didn't pick the same answer as yours. Is there something non-deterninistic in the algorithm?

PG

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Hi Team

I will be posting more detail on the SAS-L list, But it has 15 lines in a single data step(after removing checking arrays)?

There appear to be 4 solutions, kind of a lark so be suspecious. I learn more from my errors then my solutions.

see below

Here is one of them

3=c

4=d

6=f

11=k

Others are below.

The solution may not scale over 54 codes without using extended precision, no limit of number of vectors.

The only change from a previous post on SAS-L was to substitute 'bor' for 'sum'

Obs TOT E1 E2 E3 E4 D1 D2 D3 D4

1 2097150 136522 1155104 52246 1966992 3 4 6 11

2 2097150 136522 1155104 84172 1966992 3 4 9 11

3 2097150 136522 1090160 952320 84172 3 7 8 9

4 2097150 136522 1090160 84172 1966992 3 7 9 11

data x;

cost= 2**1+2**3+2**5+2**7+2**9+2**19+2**12 ; output;

cost= 2**2+2**4+2**6+2**8+2**10+2**16 ; output;

cost= 2**3+2**6+2**8+2**12+2**17+2**1+2**10 ; output;

cost= 2**5+2**13+2**20+2**15+2**16 ; output;

cost= 2**4+2**9+2**15+2**19+2**5+2**14 ; output;

cost= 2**11+2**2+2**4+2**15+2**1+2**14+2**10 ; output;

cost= 2**20+2**13+2**15+2**4+2**5+2**6+2**9 ; output;

cost= 2**19+2**18+2**17+2**11+2**15 ; output;

cost= 2**2+2**14+2**6+2**16+2**3+2**11+2**7 ; output;

cost= 2**1+2**12+2**3+2**13+2**4+2**14+2**18 ; output;

cost= 2**7+2**17+2**8+2**18+2**9+2**19+2**20+2**4; output;

run;

all = {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20} ;

%macro ngetk(n=11);

/* generate combinations of n items taken k at a time */

data Comb(keep=tot res d: e: where=(tot=2097150));

retain tot 0 res e1-e11 d1-d11;

array chksum[11] _temporary_ (529066,66900,136522,1155104,574000,52246,1090160,952320,84172,290842,1966992);

array d[11] d1-d11 (11*0);

array e[11] e1-e11 (11*0);

%do k=1 %to 11;

array c&k.f[&k] (&k.*0); /* initialize array to 0 */

ncomb = comb(&n, &k); /* number of combinations */

do j = 1 to ncomb;

rc = lexcombi(&n, &k, of c&k.f[*]);

do l=1 to &k;

tot=bor(tot, chksum[c&k.f[l]]);

d[l]=c&k.f[l];

e[l]=chksum[c&k.f[l]];

end;

res=put(tot,hex8.);

output;

tot=0;

end;

%end;

run;

%mend ngetk;

%ngetk;

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Roger,

I noticed you used lexcombi(). That means you can't handle large scale problem(i.e. have lots and lots of student) .

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

I guess this means that the solution returned might depend on the number of threads involved in the search.

PG

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Yes, the solution can depend on the number of threads used.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Impressed. I want know how many students OR can handle ?

PG,

Yes. H is a typo . Sorry.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Fantastic kskarp. Thanks a lot.

You make me dive deeply into the GA subject as I am having a hard time to understand your code.

But finally I've come behind it and I would like to share my insights for others who might be struggling to understand this code.

As the link to the problem description is broken, I start with my auto-explanation of what is the task here.**Find a minimum set of persons (variable:names) so that their attributes (here numbers from 1 to 20, vector:value) represent the range of the population (vector:all).**

your code in black

group_n=t(loc(name^=t(remove(name,1)||{' '})))-

t(loc(name^=t({' '}||remove(name,nobs)))) +1; /*Create a count of values for each person. The "data have" step earlier ensures that it gets sorted by person. It cost me a hell to figure out what it does. It's a nice trick but for me it's easier to understand to get this via this loop:

u = unique(name); /* 2. Unique values (levels) of categorical variable. */

s = j(ncol(u),1); /* 3. Allocate vector to hold results */

do i = 1 to ncol(u); /* 4. For each level... */

idx = (name=u[i]); /* 5. indicator matrix */

s[i] = t(idx)*idx;/* 6. count */

end;

encoding=repeat({0,1},1,n); /*The encoding has ones and zeros and its size is equal to number of unique persons. One selects that person meanwhile 0 leaves it out from the selection.

For the sake of understanding I created an random selection vector and played around:

call randseed(**123**); /* set random number seed */

u = j(**1**,nrow(group_n),**1**); /* allocate */

call randgen(u, "BERNOULLI",**0.1**);

repeat(u,group_n); /* Create ones and zeros as of the size of the vector of attributes. First element A=1 second element A=3...

This puts zeros (or ones) to all person's elements.

t(repeat(x,group_n))#value /* The 1/0 vector gets multiplied elementwise with the attributes values vector prior transpose.

temp=setdif(t(repeat(x,group_n))#value,{0}); /*The setdif function cancels out zeros and leaves only unique entries. If 10 as value appears several times, then the setdif resulting vector has only 1 times 10.

if countunique(temp)=all_n then sum=sum(x); /*If and only if all values are present in the current selection then count the sum of persons chosen.

else sum=99999; /*As it's a minimization problem, else set sum to a very high value to penalize.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Congratulations!

Glad you could understand my GA code.

Enjoy Genetic Algorithm, you can concur any of optimal problem via it.

I know @RobPratt might don't think so . But I love GA .

**Don't miss out on SAS Innovate - Register now for the FREE Livestream!**

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

Multiple Linear Regression in SAS

Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.

Find more tutorials on the SAS Users YouTube channel.