OR Problem

Ksharp · Posted 05-01-2017 06:54 AM

@Rick_SAS post a optimizal problem. but he solve it by SAS/QC.

@RobPratt want give it a try ?

http://blogs.sas.com/content/iml/2017/05/01/split-data-groups-mean-variance.html

Rick_SAS · Posted 05-01-2017 07:39 AM

Just FYI, I always alert Rob to any blog posts that involve optimization. If you see a paragraph at the end of one of my posts that says, "and you can also solve this with PROC OPTMODEL...", that information probably came from Rob.

RobPratt · Posted 05-01-2017 10:14 AM

Splitting into groups that have the same sum (or minimize the maximum sum) is straightforward for the MILP solver in PROC OPTMODEL, but if you want the same mean the problem becomes mixed integer nonlinear programming (MINLP), even without considering the standard deviation. PROC OPTMODEL does not yet provide an MINLP solver, and so far the most natural workarounds don’t look promising.

Ksharp · Posted 05-02-2017 08:25 AM

@RobPratt,
So Could you write some OR code to solve this problem ?

And I also want know whether OR could solve this problem 
when the dataset has one million obs

RobPratt · Posted 05-02-2017 03:07 PM

Here is PROC OPTMODEL code that uses the MILP solver to minimize the range between maximum and minimum sums and forces the treatments to have equal or nearly equal numbers of units but ignores standard deviation:

proc optmodel;
   set TREATMENTS;
   read data Treatments into TREATMENTS=[Trt];

   set UNITS;
   num &Var {UNITS};
   read data Units(where=(&Var ne .)) into UNITS=[_N_] &Var;

   var Assign {UNITS, TREATMENTS} binary;
   con AssignOnce {u in UNITS}:
      sum {t in TREATMENTS} Assign[u,t] = 1;
   impvar TreatmentSum {t in TREATMENTS} = sum {u in UNITS} &Var.[u] * Assign[u,t];
   con Cardinality {t in TREATMENTS}:
      floor(card(UNITS) / card(TREATMENTS)) 
   <= sum {u in UNITS} Assign[u,t] 
   <= ceil(card(UNITS) / card(TREATMENTS));
   var MinSum;
   var MaxSum;
   con MinSumCon {t in TREATMENTS}:
      MinSum <= TreatmentSum[t];
   con MaxSumCon {t in TREATMENTS}:
      MaxSum >= TreatmentSum[t];
   min Objective = MaxSum - MinSum;

   solve with MILP / decomp;
   print TreatmentSum;

   num Trt {u in UNITS} = round(sum {t in TREATMENTS} t * Assign[u,t].sol);
   create data Groups from [unit] &Var Trt;
quit;

proc means data=Groups mean std;
  class Trt;
  var &Var;
run;

proc sgplot data=Groups;
   vbox &Var / category=Trt;
run;

It takes about a minute and yields the following results:

SAS Output

[1]	TreatmentSum
1	139169
2	139169
3	139169
4	139168
5	139169

The MEANS Procedure

Analysis Variable : Cholesterol
Trt	N Obs	Mean	Std Dev
1	627	221.9601276	29.7794342
2	627	221.9601276	41.9241290
3	627	221.9601276	42.8058251
4	627	221.9585327	42.2811774
5	627	221.9601276	54.9869575

You can also perform an exact linearization to handle means instead of sums, but the resulting problem is much harder. See this blog post from yesterday.

Ksharp · Posted 05-03-2017 10:26 AM

@RobPratt

Here is my GA .and got the similar result with @Rick_SAS

data Units;            
set Sashelp.Heart(where=(status = "Alive"));
keep Sex AgeAtStart Height Weight Diastolic Systolic MRW Cholesterol;
run;


proc iml;
use Units nobs nobs;
read all var {Cholesterol};
close;

start function(x) global(Cholesterol,nobs,group,mean_std);
if countunique(x)=group then do;
do i=1 to group;
 idx=loc(x=i);
 temp=Cholesterol[idx]; 
 mean_std[i,1]=mean(temp);
 mean_std[i,2]=std(temp);
end;
sse=sum(mean_std[<>,]-mean_std[><,]);
end;
else sse=999999;
return (sse);
finish;

group=5;  /* <--Change it(divide into 5 groups)*/
mean_std=j(group,2,.);

encoding=j(2,nobs,1);
encoding[2,]=group;    

id=gasetup(2,nobs,123456789);
call gasetobj(id,0,"function");
call gasetsel(id,10,1,1);
call gainit(id,1000,encoding);


niter = 100;
do i = 1 to niter;
 call garegen(id);
 call gagetval(value, id);
end;
call gagetmem(mem, value, id, 1);

col_mem=t(mem);
create group var {col_mem};
append;
close;

print value[l = "Min Value:"] ;
call gaend(id);
quit;
data want;
 merge group Units(keep=Cholesterol);
run;
proc means data=want mean std;
  class col_mem;
  var Cholesterol;
run;

The MEANS Procedure

Analysis Variable : Cholesterol
COL_MEM	N Obs	Mean	Std Dev
1	630	221.9544715	43.0901775
2	662	221.9627907	43.1227683
3	629	221.9457237	43.0978582
4	648	221.9667722	43.0939030
5	649	221.9685039	43.0944852

Ksharp · Posted 08-12-2017 12:52 AM

@RobPratt

I noticed there are some missing value in ,so I fixed it again and looks like better than Rick's result.

data Units;            
set Sashelp.Heart(where=(status = "Alive" and Cholesterol is not missing));
n+1;
keep n Sex AgeAtStart Height Weight Diastolic Systolic MRW Cholesterol;
run;
 
proc iml;
use Units nobs nobs;
read all var {Cholesterol};
close;
 
start function(x) global(Cholesterol,group,mean_std);
xx=shape(x,group,0,.);
do i=1 to group;
 idx=setdif(xx[i,],{.});
 temp=Cholesterol[idx]; 
 mean_std[i,1]=mean(temp);
 mean_std[i,2]=std(temp);
end;
sse=sum(mean_std[<>,]-mean_std[><,]);
return (sse);
finish;
 
group=5;  /* <--Change it(divide into 5 groups)*/
mean_std=j(group,2,.);
  
id=gasetup(3,nobs,123456789);
call gasetobj(id,0,"function");
call gasetsel(id,10,1,1);
call gainit(id,1000);
 
niter = 100;
do i = 1 to niter;
 call garegen(id);
 call gagetval(value, id);
end;
call gagetmem(mem, value, id, 1);
 
col_mem=shape(mem,group,0,.);

create group from col_mem;
append from col_mem;
close;
 
print value[l = "Min Value:"] ;
call gaend(id);
quit;

data group;
 set group;
 group+1;
run;
proc transpose data=group out=member(drop=_: index=(col1));
by group;
var col:;
run;
 
data want;
 merge member(rename=(col1=n) where=(n is not missing)) Units(keep=n Cholesterol);
 by n;
run;
proc means data=want mean std;
  class group;
  var Cholesterol;
run;

OUTPUT:

The MEANS Procedure

Analysis Variable : Cholesterol
group	N Obs	Mean	Std Dev
1	627	221.9649123	43.1137738
2	627	221.9665072	43.1220553
3	627	221.9505582	43.0823468
4	627	221.9569378	43.0712286
5	627	221.9601276	43.1105645

OR Problem

Re: OR Problem

Re: OR Problem

Re: OR Problem

Re: OR Problem

Re: OR Problem

Re: OR Problem

Catch up on SAS Innovate 2026