SAS Optimization, and SAS Simulation Studio

turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

Highlighted
# Newbie question on using PROC OPTMODEL for simple two-index MILP minimization

Options

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-25-2017 04:29 PM

Hi!

I have a simple problem I'm trying to code in PROC OPTMODEL (SAS 9.4, SAS/OR 14.1 on Linux x64 grid server), instead of my usual method of directly creating the MPS data set for PROC OPTMILP with a SAS data step. But I'm striking out pretty badly, and thought I'd ask for assistance.

I have a data set with three variables, i j c, sorted on i and j (i < j), the i and j pairs are unique, c is the cost for each pair i and j. I'm choosing a set of least cost pairs, and each i or j value can occur in only one of the chosen pairs (it can't occur on the left for one pair and on the right for another). So, I want to create an array c[i,j] and binary decision variables v[i,j]. I want to minimize sum over i and j of c[i,j]*v[i,j], I want sum over i and j of v[i,j] to add up to a specified constant number, and I have constraints that sum over i of v[i,j] + sum over k of v[j,k] <= 1 for each j. I think this should be pretty simple to to do in OPTMODEL, but I haven't come close to getting anything to run. Any assistance would be appreciated. Thanks!

Accepted Solutions

Solution

05-25-2017
05:21 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-25-2017 04:58 PM

If I understand correctly, the following should do what you want (where I have used 3 as the "specified constant number"):

```
proc optmodel;
set <num,num> EDGES;
num c {EDGES};
read data indata into EDGES=[i j] c;
set NODES = union {<i,j> in EDGES} {i,j};
var V {EDGES} binary;
min TotalCost = sum {<i,j> in EDGES} c[i,j] * V[i,j];
con Cardinality:
sum {<i,j> in EDGES} V[i,j] = 3;
con Balance {j in NODES}:
sum {<i,(j)> in EDGES} V[i,j] + sum {<(j),k> in EDGES} V[j,k] <= 1;
solve;
print V;
create data outdata from [i j] c V;
quit;
```

For learning PROC OPTMODEL, you might find this examples book useful. In particular, this example illustrates a side-constrained network flow problem.

All Replies

Solution

05-25-2017
05:21 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-25-2017 04:58 PM

If I understand correctly, the following should do what you want (where I have used 3 as the "specified constant number"):

```
proc optmodel;
set <num,num> EDGES;
num c {EDGES};
read data indata into EDGES=[i j] c;
set NODES = union {<i,j> in EDGES} {i,j};
var V {EDGES} binary;
min TotalCost = sum {<i,j> in EDGES} c[i,j] * V[i,j];
con Cardinality:
sum {<i,j> in EDGES} V[i,j] = 3;
con Balance {j in NODES}:
sum {<i,(j)> in EDGES} V[i,j] + sum {<(j),k> in EDGES} V[j,k] <= 1;
solve;
print V;
create data outdata from [i j] c V;
quit;
```

For learning PROC OPTMODEL, you might find this examples book useful. In particular, this example illustrates a side-constrained network flow problem.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-25-2017 05:22 PM

You nailed it, RobPratt! Thank you so much!!

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-20-2017 02:22 PM - edited 06-20-2017 02:27 PM

hey @RobPratt, I am struggling with a simple modification to the above code. I'm really really new to SAS so please do forgive me if this is something very simple -- also new to the forum; I've looked all over the web and this example is the closest I've found where the data is in the format which I've got and the problem statement is similar.

In my case I have different constraints: i have destinations(n=15Kish) and origins(100ish) (i=destinations and j=origins) and must have every i assigned to only one j (that is each customer is assigned to only one distribution site). The costs are drive times from the distribution hub to the customer and the objective is to minimize the drive times -- in short it's basically assigning a customer to it's associated site with the shortest drive time. heres the mod i attempted which isn't working for me; sas tells me that the solution is "infeasible" which seems a bit odd to me. in the simple case of 10 destinations and 3 distribution sites i also get the "infeasible" issue so I'm thinking it's my constraint line which is off.

specifically I used the following constraint code:

/*device assigned to only one site*/

con Balance {i in NODES}:

sum {<(i),j> in SITEStoDEVICES} V[i,j] = 1;

which is a modification of the constraints from the OP.

Without the constraint I get all V's as zero -- I think thats meaning no assignemnts are made which of course minimized drive time to be zero overall. However, I set this up to ensure C1-C5 are assigned to S1 while C6-C10 are assigned to S2 with non assigned to S3.

Please help! I'm just not understanding something simple here.

in short the overall project objective that I'm trying to achieve is

1) assign customers to the associated sites with lowest drive times

2) get the all customers outside of a specified drive time

3) take the group from (2) and determine where would be the best new sites to have as distribution hubs to ensure all customers are within the specified drive time.

next phase would be to optimize on lowest number of distro hubs such that all customers are within a certain drive time but thats outsie the scope of my imediate need.

below is the example data/code.

data DTMS;

input destination : $11. origin : $11. DriveTime;

datalines;

C1 S1 .15

C2 S1 .10

C3 S1 .11

C4 S1 .12

C5 S1 .13

C1 S2 5.15

C2 S2 5.10

C3 S2 5.11

C4 S2 5.12

C5 S2 5.13

C1 S2 7.15

C2 S3 7.10

C3 S3 7.11

C4 S3 7.12

C5 S3 7.13

C6 S1 5.15

C7 S1 5.10

C8 S1 5.11

C9 S1 5.12

C10 S1 5.13

C6 S2 .15

C7 S2 .10

C8 S2 .11

C9 S2 .12

C10 S2 .13

C6 S3 6.15

C7 S3 6.10

C8 S3 6.11

C9 S3 6.12

C10 S3 6.13

;

proc optmodel;

set <str, str> SITEStoDEVICES;

num DriveTime {SITEStoDEVICES};

read data DTMS into SITEStoDEVICES = [destination origin] DriveTime;

set NODES = union {<i,j> in SITEStoDEVICES} {i,j};

/*assignment variable*/

var V {SITEStoDEVICES} binary;

/*objective function*/

min TotalCost = sum {<i,j> in SITEStoDEVICES} DriveTime[i,j] * V[i,j];

/*device assigned to only one site*/

con Balance {i in NODES}:

sum {<(i),j> in SITEStoDEVICES} V[i,j] = 1;

solve;

/* print V;*/

create data outdata from [i j] DriveTime V;

run;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-20-2017 03:09 PM

This doc example is closer to what you want. A big difference between your problem and the one from @Top_Katz is that you have a bipartite network, and you need to treat the two sets of nodes differently.

The following two changes will accomplish your first step:

```
/*set NODES = union {<i,j> in SITEStoDEVICES} {i,j};*/
set CUSTOMERS = setof {<i,j> in SITEStoDEVICES} i;
set SITES = setof {<i,j> in SITEStoDEVICES} j;
/* con Balance {i in NODES}:*/
con Balance {i in CUSTOMERS}:
sum {<(i),j> in SITEStoDEVICES} V[i,j] = 1;
```

But you do not actually need to solve an optimization problem to do that step. You can just explicitly compute the minimum for each customer separately:

```
str argminDriveTime {CUSTOMERS};
num minDriveTime;
for {i in CUSTOMERS} do;
minDriveTime = constant('BIG');
for {<(i),j> in SITEStoDEVICES} do;
if minDriveTime > DriveTime[i,j] then do;
minDriveTime = DriveTime[i,j];
argminDriveTime[i] = j;
end;
end;
V[i,argminDriveTime[i]] = 1;
end;
```

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-20-2017 03:25 PM - edited 06-20-2017 03:27 PM

wow that was quick. Thank you @RobPratt; much appreciated. The solution worked for my simple case and I'm figuring it'll do for the full scope case.

i see where you're going with the straight up minimum selection for this simple situation. however, i needed to know how to make the optimization portion work for the second phase of the project which is coming up.

the issue i ran into with the referenced example is this portion of the code

/* distance from customer i to site j */ num dist {i in CUSTOMERS, j in SITES} = sqrt((x[i] - x[j])^2 + (y[i] - y[j])^2); read data cdata into CUSTOMERS=[name] x y demand; read data sdata into SITES=[name] x y fixed_charge;

specifically I ran into was the fact that my data is vertically strutured with all the customer/site combos with their associated drive times. The example assumed two data sets, one for customers and one for sites with both having x/y coordinants and the "cost" is distance which is calculable using the euclidian metric; I was able to easily enough chage that to use lat/longs and geo distances but distances doen't equate to drive times.

I wasn't sure how to modify the code to do a "group by" sort of thing with a single data frame.

I could easily enough get the data into three frames where I had customers, sites, and then drive times in a third frame but wasn't sure how to get the dist variable from the third frame by looking up the customer / site combo; and unfortunantly the drive times is not a calculable metric given the lat/longs of my customers/sites.

I would love to see how you'd modify the referenced code to do either

1) a look up for the dist variable from a third data frame

and/or

2) use a single data frame with customer/site combos and a seperate third column with the "dist" or cost variable.

being so new SAS for these types of problems has really been fun but challenging. Your help has been infinately valuable and I thank you again!

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-20-2017 03:35 PM

Glad to help.

Your original READ DATA statement is already perfectly suited for the case that the drive times come from a sparse table instead of from a formula.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-20-2017 03:49 PM

good to know. I'll play around with it now that I've seen a working example and should be able to self serve for the needed finale!

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

08-28-2017 12:13 PM - edited 08-28-2017 12:13 PM

Good Day Rob,

As a follow up to this question I am now attempting to place a constraint on the model in which certain sites are known to be built for this facility location problem. In short, we know from a business persepective specific locations must be built and want the remaining locations to be greefield solutioned.

Here's what I've got.

Variable in my data frame "REQ_LOC_BY_MGMT_NUM" has 1's where the site is a known location to build and 0's otherwise.

I've tried the following:

added to the preamble of the code

num REQ_LOC_BY_MGMT_NUM {SITEStoDEVICES};

added to the objective function statements

min TotalDriveTime = sum {<customerName, siteName> in SITEStoDEVICES} DriveTimeMin[customerName, siteName] * V[customerName, siteName];

min totalbuilds = sum {<deviceSiteId, techSiteId> in SITEStoDEVICES} Build[techSiteId];

and new constraint

/* must build specific sites */

con mustbuild {deviceSiteId in CUSTOMERS, techSiteId in SITES}:

REQ_LOC_BY_MGMT_NUM[deviceSiteId,techSiteId] <= Build[techSiteId];

it's throwing the error:

ERROR: The array subscript 'V['xxxxxxxxxxxxx','xxxxxxxxxxxxxxxxx']' is invalid at line 53 column 7.

Thoughts on how to make this work? It's probably me just not understainding the example from here:

Thanks again!

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

08-28-2017 03:01 PM

It sounds to me like you want the following:

```
num REQ_LOC_BY_MGMT_NUM {SITES};
con mustbuild {techSiteId in SITES: REQ_LOC_BY_MGMT_NUM[techSiteId] = 1}:
Build[techSiteId] = 1;
```

Or you can use the FIX statement instead of declaring a single-variable constraint:

```
for {techSiteId in SITES: REQ_LOC_BY_MGMT_NUM[techSiteId] = 1}
fix Build[techSiteId] = 1;
```

If these changes don't resolve your errors, please share the full code and data.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

08-28-2017 03:39 PM

Thanks for the quick reply Rob. I'm running into errors when trying to read the data "REQ_LOC_BY_MGMT_NUM" into the model initially.

I've placed the statement "num REQ_LOC_BY_MGMT_NUM {SITES};" both before and after the SITES node creation. in both cases I get an error. The first case I get the error

ERROR 525-782: The symbol 'SITES' is unknown.

and in the second case (placement after the SITES node create)

ERROR 525-782: The symbol 'REQ_LOC_BY_MGMT_NUM' is unknown.

Specifically an example of what I've got is here:

/* Data Example */

**data** DTMS;

input destination : $11.

origin : $11.

REQ_LOC_BY_MGMT_NUM : best8.

DriveTime ;

datalines;

C1 S1 0 .15

C2 S1 0 .10

C3 S1 0 .11

C4 S1 0 .12

C5 S1 0 .13

C1 S2 0 5.15

C2 S2 0 5.10

C3 S2 0 5.11

C4 S2 0 5.12

C5 S2 0 5.13

C1 S3 0 7.15

C2 S3 1 7.10

C3 S3 1 7.11

C4 S3 1 7.12

C5 S3 1 7.13

C6 S1 0 5.15

C7 S1 0 5.10

C8 S1 0 5.11

C9 S1 0 5.12

C10 S1 0 5.13

C6 S2 0 .15

C7 S2 0 .10

C8 S2 0 .11

C9 S2 0 .12

C10 S2 0 .13

C6 S3 1 6.15

C7 S3 1 6.10

C8 S3 1 6.11

C9 S3 1 6.12

C10 S3 1 6.13

;

**proc** **optmodel**;

set <str, str> SITEStoDEVICES;

num DriveTime {SITEStoDEVICES};

read data DTMS into SITEStoDEVICES = [destination origin] DriveTime REQ_LOC_BY_MGMT_NUM;

set CUSTOMERS = setof {<i,j> in SITEStoDEVICES} i;

set SITES = setof {<i,j> in SITEStoDEVICES} j;

num REQ_LOC_BY_MGMT_NUM {SITES};

/*assignment variable*/

var V {SITEStoDEVICES} binary;

var Build {SITEStoDEVICES} binary;

/* objective functions */

min TotalCost = sum {<i,j> in SITEStoDEVICES} DriveTime[i,j] * V[i,j];

min totalbuilds = sum {<deviceSiteId, techSiteId> in SITEStoDEVICES} Build[techSiteId];

/*device assigned to only one site*/

con Balance {i in CUSTOMERS}:

sum {<(i),j> in SITEStoDEVICES} V[i,j] = **1**;

/* must build sites with assigned locations */

con link {deviceSiteId in CUSTOMERS, techSiteId in SITES}:

V[deviceSiteId,techSiteId] <= Build[techSiteId];

/*must build certain sites*/

con mustbuild {techSiteId in SITES: REQ_LOC_BY_MGMT_NUM[techSiteId] = **1**}:

Build[techSiteId] = **1**;

/*for {techSiteId in SITES: REQ_LOC_BY_MGMT_NUM[techSiteId] = 1}*/

/* fix Build[techSiteId] = 1;*/

solve;

/* print V;*/

create data outdata from [i j] DriveTime V Build REQ_LOC_BY_MGMT_NUM;

**run**;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

08-28-2017 03:56 PM

the line

var Build {SITEStoDEVICES} binary;

should read

var Build {SITES} binary;

in the example code. My apologies for being unclear there.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

08-28-2017 03:59 PM - edited 08-28-2017 04:10 PM

You should use a separate data set for REQ_LOC_BY_MGMT_NUM:

```
data SiteData;
input site $ REQ_LOC_BY_MGMT_NUM;
datalines;
S1 0
S2 0
S3 1
;
```

And then you can read it as follows:

```
read data SiteData into [site] REQ_LOC_BY_MGMT_NUM;
```

The Build variable needs to be indexed over SITES:

```
var Build {SITES} binary;
```

You have declared two objectives. By default, only the second one is used. Because Build depends only on site, I think the following makes more sense:

`min totalbuilds = sum {techSiteId in SITES} Build[techSiteId];`

You should also create two separate output data sets:

```
create data outdata from [i j] DriveTime V;
create data outdata2 from [j] Build REQ_LOC_BY_MGMT_NUM;
```

Or use an explicit [j] index as follows:

```
create data outdata from [i j] DriveTime V Build[j] REQ_LOC_BY_MGMT_NUM[j];
```

Finally, replace the RUN with QUIT.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

08-28-2017 05:24 PM

Thanks for the help Rob. With the attempt to minimize number of builds based on both minimizing drivetimes and the known locations to build I came up with this. If this looks off to you let me know but it appears to work with the following data as expected (sites 1, 2, and 3 are built with the only customer assiged to site 3 being C11).

```
proc optmodel;
set <str, str> SITEStoDEVICES;
num DriveTime {SITEStoDEVICES};
read data DTMS into SITEStoDEVICES = [destination origin] DriveTime;
set CUSTOMERS = setof {<i,j> in SITEStoDEVICES} i;
/*set SITES = setof {<i,j> in SITEStoDEVICES} j;*/
/*assignment variable*/
var V {SITEStoDEVICES} binary;
set <str> SITES;
num REQ_LOC_BY_MGMT_NUM {SITES};
read data SiteData into SITES = [site] REQ_LOC_BY_MGMT_NUM;
/* build variable*/
var Build {SITES} binary;
/* objective functions */
/* min TotalCost = sum {<i,j> in SITEStoDEVICES} DriveTime[i,j] * V[i,j];*/
/* min totalbuilds = sum {<deviceSiteId, techSiteId> in SITEStoDEVICES} Build[techSiteId];*/
min cost = sum {i in CUSTOMERS, j in SITES} V[i,j] * DriveTime[i,j];
min totalcost = cost + sum{j in SITES} Build[j];
/*min totalbuilds = sum {<deviceSiteId, techSiteId> in SITEStoDEVICES} Build[techSiteId]*DriveTime[deviceSiteId, techSiteId];*/
/*device assigned to only one site and every device is assigned*/
con Balance {i in CUSTOMERS}:
sum {<(i),j> in SITEStoDEVICES} V[i,j] = 1;
/* must build sites */
con mustbuild {techSiteId in SITES: REQ_LOC_BY_MGMT_NUM[techSiteId] = 1}:
Build[techSiteId] = 1;
/* must build sites with assigned locations */
con link {deviceSiteId in CUSTOMERS, techSiteId in SITES}:
V[deviceSiteId,techSiteId] <= Build[techSiteId];
/* do not assign customer to site more than x units away */
con distance_at_most {i in CUSTOMERS, j in SITES: DriveTime[i,j] > 1.99}:
V[i,j] = 0;
/* solve;*/
solve with milp/timetype=real;
create data outdata from [i j] DriveTime V Build[j] REQ_LOC_BY_MGMT_NUM[j];
run;
```

```
data SiteData;
input site $ REQ_LOC_BY_MGMT_NUM;
datalines;
S1 0
S2 0
S3 1
S4 0
;
```

```
data DTMS;
input destination : $11.
origin : $11.
REQ_LOC_BY_MGMT_NUM : best8.
DriveTime ;
datalines;
C1 S1 0 .15
C2 S1 0 .10
C3 S1 0 .11
C4 S1 0 .12
C5 S1 0 .13
C6 S1 0 2.15
C7 S1 0 2.10
C8 S1 0 2.11
C9 S1 0 2.12
C10 S1 0 2.13
C11 S1 0 1.99
C1 S2 0 2.75
C2 S2 0 2.70
C3 S2 0 2.71
C4 S2 0 2.72
C5 S2 0 2.73
C6 S2 0 .15
C7 S2 0 .10
C8 S2 0 .11
C9 S2 0 .12
C10 S2 0 .13
C11 S2 0 2.99
C1 S3 1 7.15
C2 S3 1 7.10
C3 S3 1 7.11
C4 S3 1 7.12
C5 S3 1 7.13
C6 S3 1 6.15
C7 S3 1 6.10
C8 S3 1 6.11
C9 S3 1 6.12
C10 S3 1 6.13
C11 S3 1 1.05
C1 S4 0 0.15
C2 S4 0 0.10
C3 S4 0 0.11
C4 S4 0 0.12
C5 S4 0 7.13
C6 S4 0 0.15
C7 S4 0 0.10
C8 S4 0 0.11
C9 S4 0 7.12
C10 S4 0 7.13
C11 S4 0 8.79
;
```