SAS Optimization, and SAS Simulation Studio

turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

a month ago

Hi!

I have a simple problem I'm trying to code in PROC OPTMODEL (SAS 9.4, SAS/OR 14.1 on Linux x64 grid server), instead of my usual method of directly creating the MPS data set for PROC OPTMILP with a SAS data step. But I'm striking out pretty badly, and thought I'd ask for assistance.

I have a data set with three variables, i j c, sorted on i and j (i < j), the i and j pairs are unique, c is the cost for each pair i and j. I'm choosing a set of least cost pairs, and each i or j value can occur in only one of the chosen pairs (it can't occur on the left for one pair and on the right for another). So, I want to create an array c[i,j] and binary decision variables v[i,j]. I want to minimize sum over i and j of c[i,j]*v[i,j], I want sum over i and j of v[i,j] to add up to a specified constant number, and I have constraints that sum over i of v[i,j] + sum over k of v[j,k] <= 1 for each j. I think this should be pretty simple to to do in OPTMODEL, but I haven't come close to getting anything to run. Any assistance would be appreciated. Thanks!

Accepted Solutions

Solution

a month ago

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

a month ago

If I understand correctly, the following should do what you want (where I have used 3 as the "specified constant number"):

```
proc optmodel;
set <num,num> EDGES;
num c {EDGES};
read data indata into EDGES=[i j] c;
set NODES = union {<i,j> in EDGES} {i,j};
var V {EDGES} binary;
min TotalCost = sum {<i,j> in EDGES} c[i,j] * V[i,j];
con Cardinality:
sum {<i,j> in EDGES} V[i,j] = 3;
con Balance {j in NODES}:
sum {<i,(j)> in EDGES} V[i,j] + sum {<(j),k> in EDGES} V[j,k] <= 1;
solve;
print V;
create data outdata from [i j] c V;
quit;
```

For learning PROC OPTMODEL, you might find this examples book useful. In particular, this example illustrates a side-constrained network flow problem.

All Replies

Solution

a month ago

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

a month ago

If I understand correctly, the following should do what you want (where I have used 3 as the "specified constant number"):

```
proc optmodel;
set <num,num> EDGES;
num c {EDGES};
read data indata into EDGES=[i j] c;
set NODES = union {<i,j> in EDGES} {i,j};
var V {EDGES} binary;
min TotalCost = sum {<i,j> in EDGES} c[i,j] * V[i,j];
con Cardinality:
sum {<i,j> in EDGES} V[i,j] = 3;
con Balance {j in NODES}:
sum {<i,(j)> in EDGES} V[i,j] + sum {<(j),k> in EDGES} V[j,k] <= 1;
solve;
print V;
create data outdata from [i j] c V;
quit;
```

For learning PROC OPTMODEL, you might find this examples book useful. In particular, this example illustrates a side-constrained network flow problem.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

a month ago

You nailed it, RobPratt! Thank you so much!!

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Tuesday - last edited Tuesday

hey @RobPratt, I am struggling with a simple modification to the above code. I'm really really new to SAS so please do forgive me if this is something very simple -- also new to the forum; I've looked all over the web and this example is the closest I've found where the data is in the format which I've got and the problem statement is similar.

In my case I have different constraints: i have destinations(n=15Kish) and origins(100ish) (i=destinations and j=origins) and must have every i assigned to only one j (that is each customer is assigned to only one distribution site). The costs are drive times from the distribution hub to the customer and the objective is to minimize the drive times -- in short it's basically assigning a customer to it's associated site with the shortest drive time. heres the mod i attempted which isn't working for me; sas tells me that the solution is "infeasible" which seems a bit odd to me. in the simple case of 10 destinations and 3 distribution sites i also get the "infeasible" issue so I'm thinking it's my constraint line which is off.

specifically I used the following constraint code:

/*device assigned to only one site*/

con Balance {i in NODES}:

sum {<(i),j> in SITEStoDEVICES} V[i,j] = 1;

which is a modification of the constraints from the OP.

Without the constraint I get all V's as zero -- I think thats meaning no assignemnts are made which of course minimized drive time to be zero overall. However, I set this up to ensure C1-C5 are assigned to S1 while C6-C10 are assigned to S2 with non assigned to S3.

Please help! I'm just not understanding something simple here.

in short the overall project objective that I'm trying to achieve is

1) assign customers to the associated sites with lowest drive times

2) get the all customers outside of a specified drive time

3) take the group from (2) and determine where would be the best new sites to have as distribution hubs to ensure all customers are within the specified drive time.

next phase would be to optimize on lowest number of distro hubs such that all customers are within a certain drive time but thats outsie the scope of my imediate need.

below is the example data/code.

data DTMS;

input destination : $11. origin : $11. DriveTime;

datalines;

C1 S1 .15

C2 S1 .10

C3 S1 .11

C4 S1 .12

C5 S1 .13

C1 S2 5.15

C2 S2 5.10

C3 S2 5.11

C4 S2 5.12

C5 S2 5.13

C1 S2 7.15

C2 S3 7.10

C3 S3 7.11

C4 S3 7.12

C5 S3 7.13

C6 S1 5.15

C7 S1 5.10

C8 S1 5.11

C9 S1 5.12

C10 S1 5.13

C6 S2 .15

C7 S2 .10

C8 S2 .11

C9 S2 .12

C10 S2 .13

C6 S3 6.15

C7 S3 6.10

C8 S3 6.11

C9 S3 6.12

C10 S3 6.13

;

proc optmodel;

set <str, str> SITEStoDEVICES;

num DriveTime {SITEStoDEVICES};

read data DTMS into SITEStoDEVICES = [destination origin] DriveTime;

set NODES = union {<i,j> in SITEStoDEVICES} {i,j};

/*assignment variable*/

var V {SITEStoDEVICES} binary;

/*objective function*/

min TotalCost = sum {<i,j> in SITEStoDEVICES} DriveTime[i,j] * V[i,j];

/*device assigned to only one site*/

con Balance {i in NODES}:

sum {<(i),j> in SITEStoDEVICES} V[i,j] = 1;

solve;

/* print V;*/

create data outdata from [i j] DriveTime V;

run;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Tuesday

This doc example is closer to what you want. A big difference between your problem and the one from @Top_Katz is that you have a bipartite network, and you need to treat the two sets of nodes differently.

The following two changes will accomplish your first step:

```
/*set NODES = union {<i,j> in SITEStoDEVICES} {i,j};*/
set CUSTOMERS = setof {<i,j> in SITEStoDEVICES} i;
set SITES = setof {<i,j> in SITEStoDEVICES} j;
/* con Balance {i in NODES}:*/
con Balance {i in CUSTOMERS}:
sum {<(i),j> in SITEStoDEVICES} V[i,j] = 1;
```

But you do not actually need to solve an optimization problem to do that step. You can just explicitly compute the minimum for each customer separately:

```
str argminDriveTime {CUSTOMERS};
num minDriveTime;
for {i in CUSTOMERS} do;
minDriveTime = constant('BIG');
for {<(i),j> in SITEStoDEVICES} do;
if minDriveTime > DriveTime[i,j] then do;
minDriveTime = DriveTime[i,j];
argminDriveTime[i] = j;
end;
end;
V[i,argminDriveTime[i]] = 1;
end;
```

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Tuesday - last edited Tuesday

wow that was quick. Thank you @RobPratt; much appreciated. The solution worked for my simple case and I'm figuring it'll do for the full scope case.

i see where you're going with the straight up minimum selection for this simple situation. however, i needed to know how to make the optimization portion work for the second phase of the project which is coming up.

the issue i ran into with the referenced example is this portion of the code

/* distance from customer i to site j */ num dist {i in CUSTOMERS, j in SITES} = sqrt((x[i] - x[j])^2 + (y[i] - y[j])^2); read data cdata into CUSTOMERS=[name] x y demand; read data sdata into SITES=[name] x y fixed_charge;

specifically I ran into was the fact that my data is vertically strutured with all the customer/site combos with their associated drive times. The example assumed two data sets, one for customers and one for sites with both having x/y coordinants and the "cost" is distance which is calculable using the euclidian metric; I was able to easily enough chage that to use lat/longs and geo distances but distances doen't equate to drive times.

I wasn't sure how to modify the code to do a "group by" sort of thing with a single data frame.

I could easily enough get the data into three frames where I had customers, sites, and then drive times in a third frame but wasn't sure how to get the dist variable from the third frame by looking up the customer / site combo; and unfortunantly the drive times is not a calculable metric given the lat/longs of my customers/sites.

I would love to see how you'd modify the referenced code to do either

1) a look up for the dist variable from a third data frame

and/or

2) use a single data frame with customer/site combos and a seperate third column with the "dist" or cost variable.

being so new SAS for these types of problems has really been fun but challenging. Your help has been infinately valuable and I thank you again!

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Tuesday

Glad to help.

Your original READ DATA statement is already perfectly suited for the case that the drive times come from a sparse table instead of from a formula.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Tuesday

good to know. I'll play around with it now that I've seen a working example and should be able to self serve for the needed finale!