Min Distance Value: |
---|
196 |
Hi Experts,
I need to minimize distance to customer location from Portfolio Managers' location. I have the following customer table-
Customers | Postcode | Sales |
Riya | 110051 | 98 |
Raj | 110007 | 45 |
Sam | 110032 | 72 |
Sompa | 110002 | 31 |
Predy | 110003 | 48 |
David | 110005 | 76 |
The following is the table that has Portfolio Manager Information with Territory/Region.
PM | Position | PM_Postcode | Territory |
A | Manager | 110007 | NorthDelhi |
B | Analyst | 110051 | EastDelhi |
C | Manager | 110032 | EastDelhi |
D | Analyst | 110003 | NorthDelhi |
There are some constraints - If sale is greater than 50, the customer should be handled by PM who is Manager. Otherwise Analyst. Within each territory, there is a constraint that manager can only handle up to 3 customers and analyst only up to 5 customers. I am sorry for a very short sample data to work on this problem. I hope the problem makes sense. I am very new to PROC OPTMODEL so it would be highly appreciated if you could provide the solution with example. Thanks in anticipation!
I presume your objective is, for each customer with over 50 sales, minimize distance to a manager, and for all other customers, to minimize distance to either analysit or manager.
I presume there are some customers in postal codes without analysts or managers. In that case you can't solve this problem without having distances between postal code. Show that data too.
Also, do any postalcodes have both a manager and an analyst? If so, then how do you assign small customers?
use function ZIPCITYDISTANCE(zip-code-1, zip-code-2) could get the distance between two zip. Your question is very tough. I would like to see how Rob could solve it by OR .
Thanks for your reply. If i ain't wrong ZIPCITYDISTANCE function works only for US postcodes. Anyway, i have measured the distance between locations by calculating Eucledian distance of the (x y) coordinates. I need to know how to use PROC OPTMODEL to solve this problem. Since it's an urgent request, any help would be highly appreciated.
" Within each territory, there is a constraint that manager can only handle up to 3 customers and analyst only up to 5 customers." I don't understand 'territory' . Why ? 'territory' has nothing to do with this constraint ? Maybe I could use GA to get it, but as Rob mentioned before, OR might be your best choice.
" Within each territory, there is a constraint that manager can only handle up to 3 customers and analyst only up to 5 customers." - I mean to say in each territory i need to maintain the ratio such that manager can serve max 3 customers whose sales or turnover >= (x amount). And Analyst can serve max 5 customers whose sales or turnover < (x amount). By territory, i mean 'Region / City'.
OK. Here is IML code of GA.
I also want see the OR code by @RobPratt ,that might your best shoot.
If you have big table GA can't guarantee you to get th optimal solution.
data have1;
infile cards truncover expandtabs;
input Customers $ Postcode $ sales x y;
cards;
Riya 110051 98 34 78
Raj 110007 45 65 65
Sam 110032 72 78 34
Sompa 110002 31 21 42
Predy 110003 48 47 78
David 110005 76 74 32
;
run;
data have2;
infile cards truncover expandtabs;
input PM $ Position $ PM_Postcode $ Territory : $20. x y;
cards;
A Manager 110007 NorthDelhi 23 54
B Analyst 110051 EastDelhi 34 45
C Manager 110032 EastDelhi 56 56
D Analyst 110003 NorthDelhi 43 88
;
run;
data customer;
set have1;
length position $ 10;
if sales gt 50 then position='Manager';
else position='Analyst';
run;
data pm;
set have2;
if position='Manager' then do;
do i=1 to 3;
output;
end;
end;
else do;
do i=1 to 5;
output;
end;
end;
drop i;
run;
proc sort data=customer;by position;run;
proc sort data=pm;by position;run;
proc iml;
use customer nobs nobs1;
read all var{position};
read all var{x y} into xy_cust;
read all var _char_ into all1[c=vnames1];
close;
n=uniqueby(position);
need_analyst=n[2]-1;
need_manager=nobs1-n[2]+1;
use pm nobs nobs2;
read all var{position};
read all var{x y} into xy_pm;
read all var _char_ into all2[c=vnames2];
close;
n=uniqueby(position);
have_analyst=n[2]-1;
have_manager=nobs2-n[2]+1;
if nobs1>nobs2 then abort 'ERROR: too many customers.';
start func(x)
global(have_analyst,have_manager,need_analyst,need_manager,xy_cust,xy_pm);
t1=x[1:need_analyst];
t2=x[(need_analyst+1):(need_analyst+need_manager)];
if countunique(t1)^=need_analyst | max(t1)>have_analyst |
countunique(t2)^=need_manager | min(t2)<(have_analyst+1)
then obj=999999;
else obj=sum(abs((xy_pm[x,]-xy_cust)));
return (obj);
finish;
n=need_analyst+need_manager;
encoding=repeat(1//have_analyst,1,need_analyst)||
repeat((have_analyst+1)//(have_analyst+have_manager),1,need_manager);
id=gasetup(2,n,12345678);
call gasetobj(id,0,"func");
call gasetcro(id,0.95,2);
call gasetmut(id,0.95,3);
call gasetsel(id,100,1,1);
call gainit(id,10000,encoding);
niter = 1000; /*change it as bigger as you can*/
do i = 1 to niter;
call garegen(id);
call gagetval(value, id);
end;
call gagetmem(mem, value, id, 1);
print value[l = "Min Distance Value:"],
all1[l="" c=vnames1] (all2[mem,])[l="" c=vnames2] ;
call gaend(id);
quit;
OUTPUT:
Min Distance Value: |
---|
196 |
Customers | Postcode | position | PM | Position | PM_Postcode | Territory |
---|---|---|---|---|---|---|
Raj | 110007 | Analyst | D | Analyst | 110003 | NorthDelhi |
Sompa | 110002 | Analyst | B | Analyst | 110051 | EastDelhi |
Predy | 110003 | Analyst | D | Analyst | 110003 | NorthDelhi |
Riya | 110051 | Manager | A | Manager | 110007 | NorthDelhi |
Sam | 110032 | Manager | C | Manager | 110032 | EastDelhi |
David | 110005 | Manager | C | Manager | 110032 | EastDelhi |
Thanks @Ksharp for taking out time to write this code. You are very helpful. God bless you! I'll test this code tomorrow and would keep you posted about it.
OK. Here is. Since @Rob post OR code, maybe you should that . data have1; infile cards truncover expandtabs; input Customers $ Postcode $ sales x y; cards; Riya 110051 98 34 78 Raj 110007 45 65 65 Sam 110032 72 78 34 Sompa 110002 31 21 42 Predy 110003 48 47 78 David 110005 76 74 32 ; run; data have2; infile cards truncover expandtabs; input PM $ Position $ PM_Postcode $ Territory : $20. x y; cards; A Manager 110007 NorthDelhi 23 54 B Analyst 110051 EastDelhi 34 45 C Manager 110032 EastDelhi 56 56 D Analyst 110003 NorthDelhi 43 88 ; run; data customer; set have1; length position $ 10; if sales gt 50 then position='Manager'; else position='Analyst'; run; data pm; set have2; if position='Manager' then do; do i=1 to 3; output; end; end; else do; do i=1 to 5; output; end; end; drop i; run; proc sort data=customer;by position;run; proc sort data=pm;by position;run; proc iml; use customer nobs nobs1; read all var{position}; read all var{x y} into xy_cust; read all var _char_ into all1[c=vnames1]; close; n=uniqueby(position); need_analyst=n[2]-1; need_manager=nobs1-n[2]+1; use pm nobs nobs2; read all var{position}; read all var{x y} into xy_pm; read all var _char_ into all2[c=vnames2]; close; n=uniqueby(position); have_analyst=n[2]-1; have_manager=nobs2-n[2]+1; if nobs1>nobs2 then abort 'ERROR: too many customers.'; start func(x) global(have_analyst,have_manager,need_analyst,need_manager,xy_cust,xy_pm); t1=x[1:need_analyst]; t2=x[(need_analyst+1):(need_analyst+need_manager)]; if countunique(t1)^=need_analyst | max(t1)>have_analyst | countunique(t2)^=need_manager | min(t2)<(have_analyst+1) then obj=999999; else obj=sum(abs((xy_pm[x,]-xy_cust))); return (obj); finish; n=need_analyst+need_manager; encoding=repeat(1//have_analyst,1,need_analyst)|| repeat((have_analyst+1)//(have_analyst+have_manager),1,need_manager); id=gasetup(2,n,12345678); call gasetobj(id,0,"func"); call gasetcro(id,0.95,2); call gasetmut(id,0.95,3); call gasetsel(id,10,1,1); call gainit(id,10000,encoding); niter = 1000; /*change it as bigger as you can*/ do i = 1 to niter; call garegen(id); call gagetval(value, id); end; call gagetmem(mem, value, id, 1); want=all1||all2[mem,] ; vnames1[ncol(vnames1)]='CPosition'; vnames=(vnames1||vnames2); create want from want[c=vnames] ; append from want; close; print value[l = "Min Distance Value:"]; call gaend(id); quit; proc print data=want noobs;run;
Here is PROC OPTMODEL code, using the same distance metric as @Ksharp:
proc optmodel;
set <str> CUSTOMERS;
num sales {CUSTOMERS};
num x_c {CUSTOMERS};
num y_c {CUSTOMERS};
read data have1 into CUSTOMERS=[customers] sales x_c=x y_c=y;
set <str> PORTFOLIO_MANAGERS;
str position {PORTFOLIO_MANAGERS};
num x_p {PORTFOLIO_MANAGERS};
num y_p {PORTFOLIO_MANAGERS};
read data have2 into PORTFOLIO_MANAGERS=[pm] position x_p=x y_p=y;
set CP_PAIRS =
{c in CUSTOMERS, p in PORTFOLIO_MANAGERS:
(sales[c] > 50 and position[p] = 'Manager') or (sales[c] <= 50 and position[p] = 'Analyst')};
num dist {<c,p> in CP_PAIRS} = abs(x_c[c] - x_p[p]) + abs(y_c[c] - y_p[p]);
var Assign {CP_PAIRS} binary;
min TotalDistance = sum {<c,p> in CP_PAIRS} dist[c,p] * Assign[c,p];
con OneManagerPerCustomer {c in CUSTOMERS}:
sum {<(c),p> in CP_PAIRS} Assign[c,p] = 1;
con Cardinality {p in PORTFOLIO_MANAGERS}:
sum {<c,(p)> in CP_PAIRS} Assign[c,p] <= (if position[p] = 'Manager' then 3 else 5);
solve;
print Assign;
quit;
SAS Output
Solution Summary | |
---|---|
Solver | MILP |
Algorithm | Branch and Cut |
Objective Function | TotalDistance |
Solution Status | Optimal |
Objective Value | 196 |
Relative Gap | 0 |
Absolute Gap | 0 |
Primal Infeasibility | 0 |
Bound Infeasibility | 0 |
Integer Infeasibility | 0 |
Best Bound | 196 |
Nodes | 0 |
Iterations | 0 |
Presolve Time | 0.00 |
Solution Time | 0.00 |
Assign | ||||
---|---|---|---|---|
A | B | C | D | |
David | 0 | 1 | ||
Predy | 0 | 1 | ||
Raj | 0 | 1 | ||
Riya | 1 | 0 | ||
Sam | 0 | 1 | ||
Sompa | 1 | 0 |
I should also point out that your problem decomposes into two completely disjoint problems, according to sales and position. A simple way to exploit that structure is to use the decomposition algorithm to detect and solve the two subproblems in parallel:
solve with MILP / decomp=(method=concomp);
Thanks a ton @RobPratt for your solution. At the same time, i apologize for so late reply. Your solution works like a charm on small dataset. I have learnt a lot from your solution as i'm new to PROC OPTMODEL. In my dataset, i have around 30k customers and approx. 700 PMs (including both 'Analyst' and 'Manager'). It throws an error 'Out of memory during solution resolution'. I understand it creates a cartesian product of variables which is quite high. It's using 4 threads during processing. Is there any workaround to fine tune the performance? I can comprimise the solution for making it run. Would you advise any performance tips? To workaround this issue, i have split data into 4 parts based on their region and checking distance within each region separately and later combining all the datasets. But it's not an optimal solution for some of the customers whose location is in the border areas of the region. They might have a nearest distance in the other region. Thanks in anticipation! Any help would be highly appreciated!
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9.
Lock in the best rate now before the price increases on April 1.