Re: Optimize Traveling Distance

Ujjawal · Posted 11-02-2016 02:24 PM

Hi Experts,

I need to minimize distance to customer location from Portfolio Managers' location. I have the following customer table-

Customers	Postcode	Sales
Riya	110051	98
Raj	110007	45
Sam	110032	72
Sompa	110002	31
Predy	110003	48
David	110005	76

The following is the table that has Portfolio Manager Information with Territory/Region.

PM	Position	PM_Postcode	Territory
A	Manager	110007	NorthDelhi
B	Analyst	110051	EastDelhi
C	Manager	110032	EastDelhi
D	Analyst	110003	NorthDelhi

There are some constraints - If sale is greater than 50, the customer should be handled by PM who is Manager. Otherwise Analyst. Within each territory, there is a constraint that manager can only handle up to 3 customers and analyst only up to 5 customers. I am sorry for a very short sample data to work on this problem. I hope the problem makes sense. I am very new to PROC OPTMODEL so it would be highly appreciated if you could provide the solution with example. Thanks in anticipation!

mkeintz · Posted 11-02-2016 03:53 PM

I presume your objective is, for each customer with over 50 sales, minimize distance to a manager, and for all other customers, to minimize distance to either analysit or manager.

I presume there are some customers in postal codes without analysts or managers. In that case you can't solve this problem without having distances between postal code. Show that data too.

Also, do any postalcodes have both a manager and an analyst? If so, then how do you assign small customers?

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

Ksharp · Posted 11-03-2016 01:57 AM

use function ZIPCITYDISTANCE(zip-code-1, zip-code-2)
could get the distance between two zip.

Your question is very tough. I would like to see how Rob could solve it by OR .

Ujjawal · Posted 11-03-2016 02:04 PM

Thanks for your reply. If i ain't wrong ZIPCITYDISTANCE function works only for US postcodes. Anyway, i have measured the distance between locations by calculating Eucledian distance of the (x y) coordinates. I need to know how to use PROC OPTMODEL to solve this problem. Since it's an urgent request, any help would be highly appreciated.

Ksharp · Posted 11-03-2016 10:16 PM

" Within each territory, there is a constraint that manager can only handle up to 3 customers and analyst only up to 5 customers."
I don't understand 'territory' . Why ? 'territory' has nothing to do with this constraint ?

Maybe I could use GA to get it, but as Rob mentioned before, OR might be your best choice.

Ujjawal · Posted 11-04-2016 04:44 AM

" Within each territory, there is a constraint that manager can only handle up to 3 customers and analyst only up to 5 customers." - I mean to say in each territory i need to maintain the ratio such that manager can serve max 3 customers whose sales or turnover >= (x amount). And Analyst can serve max 5 customers whose sales or turnover < (x amount). By territory, i mean 'Region / City'.

Ksharp · Posted 11-04-2016 06:41 AM

That territory doesn't mean anything, just could say manager can serve max 3 customers,and Analyst can serve max 5 customers ?

Ksharp · Posted 11-04-2016 09:08 AM

OK. Here is IML code of GA.

I also want see the OR code by @RobPratt ,that might your best shoot.

If you have big table GA can't guarantee you to get th optimal solution.

data have1;
infile cards truncover expandtabs;
input Customers	$ Postcode $ sales x y;
cards;
Riya	110051	98 34 78 
Raj	110007	45 65 65
Sam	110032	72 78 34 
Sompa	110002	31 21 42
Predy	110003	48 47 78
David	110005	76 74 32
;
run;
 
data have2;
infile cards truncover expandtabs;
input PM $ Position $ PM_Postcode $ Territory : $20. x y;
cards;
A	Manager	110007	NorthDelhi 23 54
B	Analyst	110051	EastDelhi 34 45
C	Manager	110032	EastDelhi 56 56
D	Analyst	110003	NorthDelhi 43 88
;
run;

data customer;
 set have1;
 length position $ 10;
 if sales gt 50 then position='Manager';
  else position='Analyst';
run;
data pm;
 set have2;
 if position='Manager' then do;
  do i=1 to 3;
   output;
  end;
 end;
 else do;
  do i=1 to 5;
   output;
  end;
 end;
drop i;
run;

proc sort data=customer;by position;run;
proc sort data=pm;by position;run;

proc iml;
use customer nobs nobs1;
read all var{position};
read all var{x y} into xy_cust;
read all var _char_ into all1[c=vnames1];
close;
n=uniqueby(position);
need_analyst=n[2]-1;
need_manager=nobs1-n[2]+1;

use pm nobs nobs2;
read all var{position};
read all var{x y} into xy_pm;
read all var _char_ into all2[c=vnames2];
close;
n=uniqueby(position);
have_analyst=n[2]-1;
have_manager=nobs2-n[2]+1;

if nobs1>nobs2 then abort 'ERROR: too many customers.';

start func(x) 
global(have_analyst,have_manager,need_analyst,need_manager,xy_cust,xy_pm); 
t1=x[1:need_analyst];
t2=x[(need_analyst+1):(need_analyst+need_manager)];
if countunique(t1)^=need_analyst |  max(t1)>have_analyst |
   countunique(t2)^=need_manager |  min(t2)<(have_analyst+1)
 then obj=999999;
else obj=sum(abs((xy_pm[x,]-xy_cust)));
 return (obj);
finish;


n=need_analyst+need_manager;
encoding=repeat(1//have_analyst,1,need_analyst)||
repeat((have_analyst+1)//(have_analyst+have_manager),1,need_manager);


id=gasetup(2,n,12345678);
call gasetobj(id,0,"func");
call gasetcro(id,0.95,2);
call gasetmut(id,0.95,3);
call gasetsel(id,100,1,1);
call gainit(id,10000,encoding); 

niter = 1000;  /*change it as bigger as you can*/
do i = 1 to niter;
 call garegen(id);
 call gagetval(value, id);
end;
call gagetmem(mem, value, id, 1);

print value[l = "Min Distance Value:"],
      all1[l="" c=vnames1] (all2[mem,])[l="" c=vnames2] ;
call gaend(id);
quit;

OUTPUT:

Min Distance Value:
196

Customers	Postcode	position	PM	Position	PM_Postcode	Territory
Raj	110007	Analyst	D	Analyst	110003	NorthDelhi
Sompa	110002	Analyst	B	Analyst	110051	EastDelhi
Predy	110003	Analyst	D	Analyst	110003	NorthDelhi
Riya	110051	Manager	A	Manager	110007	NorthDelhi
Sam	110032	Manager	C	Manager	110032	EastDelhi
David	110005	Manager	C	Manager	110032	EastDelhi

Ujjawal · Posted 11-05-2016 08:13 AM

Thanks @Ksharp for taking out time to write this code. You are very helpful. God bless you! I'll test this code tomorrow and would keep you posted about it.

Ujjawal · Posted 11-07-2016 09:52 AM

Hi,

Thank you so much for your solution. You are very helpful. I appreciate it very much. May god bless you! Would you mind tweaking the code and create data set containing output of optimization? I was exploring create data statement to output the result but it's throwing an error. Couldn't define Array in create data statement. Thanks in anticipation!

Ksharp · Posted 11-07-2016 10:05 PM

OK. Here is. Since @Rob post OR code, maybe you should that .



data have1;
infile cards truncover expandtabs;
input Customers	$ Postcode $ sales x y;
cards;
Riya	110051	98 34 78 
Raj	110007	45 65 65
Sam	110032	72 78 34 
Sompa	110002	31 21 42
Predy	110003	48 47 78
David	110005	76 74 32
;
run;
 
data have2;
infile cards truncover expandtabs;
input PM $ Position $ PM_Postcode $ Territory : $20. x y;
cards;
A	Manager	110007	NorthDelhi 23 54
B	Analyst	110051	EastDelhi 34 45
C	Manager	110032	EastDelhi 56 56
D	Analyst	110003	NorthDelhi 43 88
;
run;

data customer;
 set have1;
 length position $ 10;
 if sales gt 50 then position='Manager';
  else position='Analyst';
run;
data pm;
 set have2;
 if position='Manager' then do;
  do i=1 to 3;
   output;
  end;
 end;
 else do;
  do i=1 to 5;
   output;
  end;
 end;
drop i;
run;

proc sort data=customer;by position;run;
proc sort data=pm;by position;run;

proc iml;
use customer nobs nobs1;
read all var{position};
read all var{x y} into xy_cust;
read all var _char_ into all1[c=vnames1];
close;
n=uniqueby(position);
need_analyst=n[2]-1;
need_manager=nobs1-n[2]+1;

use pm nobs nobs2;
read all var{position};
read all var{x y} into xy_pm;
read all var _char_ into all2[c=vnames2];
close;
n=uniqueby(position);
have_analyst=n[2]-1;
have_manager=nobs2-n[2]+1;

if nobs1>nobs2 then abort 'ERROR: too many customers.';

start func(x) 
global(have_analyst,have_manager,need_analyst,need_manager,xy_cust,xy_pm); 
t1=x[1:need_analyst];
t2=x[(need_analyst+1):(need_analyst+need_manager)];
if countunique(t1)^=need_analyst |  max(t1)>have_analyst |
   countunique(t2)^=need_manager |  min(t2)<(have_analyst+1)
 then obj=999999;
else obj=sum(abs((xy_pm[x,]-xy_cust)));
 return (obj);
finish;


n=need_analyst+need_manager;
encoding=repeat(1//have_analyst,1,need_analyst)||
repeat((have_analyst+1)//(have_analyst+have_manager),1,need_manager);


id=gasetup(2,n,12345678);
call gasetobj(id,0,"func");
call gasetcro(id,0.95,2);
call gasetmut(id,0.95,3);
call gasetsel(id,10,1,1);
call gainit(id,10000,encoding); 

niter = 1000;  /*change it as bigger as you can*/
do i = 1 to niter;
 call garegen(id);
 call gagetval(value, id);
end;
call gagetmem(mem, value, id, 1);

want=all1||all2[mem,] ;
vnames1[ncol(vnames1)]='CPosition';
vnames=(vnames1||vnames2);
create want from want[c=vnames] ;
append from want;
close;
print value[l = "Min Distance Value:"];
call gaend(id);
quit;
proc print data=want noobs;run;

RobPratt · Posted 11-07-2016 12:12 PM

Here is PROC OPTMODEL code, using the same distance metric as @Ksharp:

proc optmodel;
   set <str> CUSTOMERS;
   num sales {CUSTOMERS};
   num x_c {CUSTOMERS};
   num y_c {CUSTOMERS};
   read data have1 into CUSTOMERS=[customers] sales x_c=x y_c=y;

   set <str> PORTFOLIO_MANAGERS;
   str position {PORTFOLIO_MANAGERS};
   num x_p {PORTFOLIO_MANAGERS};
   num y_p {PORTFOLIO_MANAGERS};
   read data have2 into PORTFOLIO_MANAGERS=[pm] position x_p=x y_p=y;

   set CP_PAIRS = 
      {c in CUSTOMERS, p in PORTFOLIO_MANAGERS: 
      (sales[c] > 50 and position[p] = 'Manager') or (sales[c] <= 50 and position[p] = 'Analyst')};
   num dist {<c,p> in CP_PAIRS} = abs(x_c[c] - x_p[p]) + abs(y_c[c] - y_p[p]);

   var Assign {CP_PAIRS} binary;
   min TotalDistance = sum {<c,p> in CP_PAIRS} dist[c,p] * Assign[c,p];
   con OneManagerPerCustomer {c in CUSTOMERS}:
      sum {<(c),p> in CP_PAIRS} Assign[c,p] = 1;
   con Cardinality {p in PORTFOLIO_MANAGERS}:
      sum {<c,(p)> in CP_PAIRS} Assign[c,p] <= (if position[p] = 'Manager' then 3 else 5);

   solve;
   print Assign;
quit;

SAS Output

Solution Summary
Solver	MILP
Algorithm	Branch and Cut
Objective Function	TotalDistance
Solution Status	Optimal
Objective Value	196

Relative Gap	0
Absolute Gap	0
Primal Infeasibility	0
Bound Infeasibility	0
Integer Infeasibility	0

Best Bound	196
Nodes	0
Iterations	0
Presolve Time	0.00
Solution Time	0.00

Assign
	A	B	C	D
David	0		1
Predy		0		1
Raj		0		1
Riya	1		0
Sam	0		1
Sompa		1		0

RobPratt · Posted 11-08-2016 09:20 AM

I should also point out that your problem decomposes into two completely disjoint problems, according to sales and position. A simple way to exploit that structure is to use the decomposition algorithm to detect and solve the two subproblems in parallel:

   solve with MILP / decomp=(method=concomp);

Ujjawal · Posted 11-15-2016 12:02 PM

Thanks a ton @RobPratt for your solution. At the same time, i apologize for so late reply. Your solution works like a charm on small dataset. I have learnt a lot from your solution as i'm new to PROC OPTMODEL. In my dataset, i have around 30k customers and approx. 700 PMs (including both 'Analyst' and 'Manager'). It throws an error 'Out of memory during solution resolution'. I understand it creates a cartesian product of variables which is quite high. It's using 4 threads during processing. Is there any workaround to fine tune the performance? I can comprimise the solution for making it run. Would you advise any performance tips? To workaround this issue, i have split data into 4 parts based on their region and checking distance within each region separately and later combining all the datasets. But it's not an optimal solution for some of the customers whose location is in the border areas of the region. They might have a nearest distance in the other region. Thanks in anticipation! Any help would be highly appreciated!

RobPratt · Posted 11-15-2016 01:02 PM

The first two things to try are:
1. Use the SAS memsize option to increase the amount of memory available.
2. Split your data into two disjoint parts according to sales and position. Because those two problems are independent, such a split loses no optimality.