Hi!

I am running a simple program to create "bins" (contiguous groups of an ordered variable). There is a binary outcome variable, and the objective is to minimize the total within-bin corrected sum of squares. I tried to upload a small SAS data set of 83 points (with the zeros and ones already aggregated for each distinct value of the ordered variable), but the site kept telling me "The contents of the attachment doesn't match its file type," so I have uploaded it as .csv instead. In this case we are creating seven groups.

Here is my code:

*********************************************************************************************************;
%let wlib    = work ;
%let blib    = ;        %* input library ;
%let solid   = om02 ;
%let numBins = 7;
%let size_lb = 1000;
%let size_ub = 10000;

proc optmodel;
   set POINTS;
   num nv_cum_all {POINTS};
   num nv_cum_one {POINTS};
   num nv_ind_all {POINTS};
   num nv_ind_one {POINTS};
   read data &blib..Agg_sort_pred_var_120 into POINTS=[nv_obs_num]
      nv_cum_all nv_cum_one nv_ind_all nv_ind_one;

   /* arc <i,j> covers points {i,...,j} */
   set ARCS init {i in POINTS, j in POINTS: i < j};

   /* bin size, number of ones, and event rate, from the cumulative counts */
   num size {<i,j> in ARCS} = nv_cum_all[j] - (if i-1 in POINTS then nv_cum_all[i-1]);
   num ones {<i,j> in ARCS} = (nv_cum_one[j] - (if i-1 in POINTS then nv_cum_one[i-1]));
   num rate {<i,j> in ARCS} = ones[i,j] / size[i,j];

   num sum_all  = sum {i in POINTS} nv_ind_all[i];
   num sum_one  = sum {i in POINTS} nv_ind_one[i];
   num exp_rate = (sum_one / sum_all);

   /* keep only arcs whose size is within the allowed bounds */
   ARCS = {<i,j> in ARCS: &size_lb. <= size[i,j] <= &size_ub.};

   /* corrected SS of a 0/1 outcome within a bin: ones*(1 - rate) = (size - ones)*rate */
   num sos {<i,j> in ARCS} = ((size[i,j] - ones[i,j]) * rate[i,j]);

   var Pairs {ARCS} binary;

   min Objective = sum {<i,j> in ARCS} (sos[i,j] * Pairs[i,j]);

   /* exactly &numBins. bins, and every point in exactly one bin */
   con NumBins: sum {<i,j> in ARCS} Pairs[i,j] = &numBins.;
   con Coverage {p in POINTS}: sum {<i,j> in ARCS: i le p le j} Pairs[i,j] = 1;

   solve;

   set SUPPORT = {<i,j> in ARCS: Pairs[i,j].sol > 0.5};
   print {<i,j> in SUPPORT} Pairs;

   create data &wlib..solution&solid.
      from [i j]=SUPPORT bin=('C'||i||'_'||(j)) size sos rate;
   save mps &wlib..mps&solid.;
quit;

data &wlib..solution&solid.;
   set &wlib..solution&solid.;
   diff = rate - lag(rate);   /* change in rate from the previous bin */
run;

proc print; run;
*********************************************************************************************************;

Here is the "Optimal within Relative Gap" solution I obtained from SAS/OR 14.1 in SAS 9.4 on a Linux x64 grid server (this output actually comes from OPTMILP, but it's the same solution):

Obs  _OBJ_ID_   _RHS_ID_  _VAR_         _TYPE_  _OBJCOEF_  _LBOUND_  _UBOUND_  _VALUE_  obs_num
  1  Objective  .RHS.     Pairs[1,14]   B          341.69         0         1        1        6
  2  Objective  .RHS.     Pairs[15,40]  B         1760.96         0         1        1      452
  3  Objective  .RHS.     Pairs[41,44]  B          240.84         0         1        1     1173
  4  Objective  .RHS.     Pairs[45,52]  B          485.65         0         1        1     1294
  5  Objective  .RHS.     Pairs[53,55]  B          318.08         0         1        1     1524
  6  Objective  .RHS.     Pairs[56,73]  B         1271.36         0         1        1     1625
  7  Objective  .RHS.     Pairs[74,83]  B          661.05         0         1        1     1911

The sum of _OBJCOEF_ = 5079.63

Here is a slightly better solution (I think it's optimal) with sum of _OBJCOEF_ = 5079.62; three of the seven groups are the same as in the first solution:

Obs  _OBJ_ID_   _RHS_ID_  _VAR_         _TYPE_  _OBJCOEF_  _LBOUND_  _UBOUND_  _VALUE_  obs_num
  1  Objective  .RHS.     Pairs[1,14]   B          341.69         1         1        1        6
  2  Objective  .RHS.     Pairs[15,21]  B          344.32         1         1        1      433
  3  Objective  .RHS.     Pairs[22,24]  B          247.30         1         1        1      633
  4  Objective  .RHS.     Pairs[25,40]  B         1167.22         1         1        1      731
  5  Objective  .RHS.     Pairs[41,44]  B          240.84         1         1        1     1173
  6  Objective  .RHS.     Pairs[45,73]  B         2077.20         1         1        1     1315
  7  Objective  .RHS.     Pairs[74,83]  B          661.05         1         1        1     1911

QUESTION: How do I "tweak" PROC OPTMODEL to obtain the better solution from the same code?

Thanks!
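
P.S. For anyone who wants to cross-check the two solutions outside of SAS: because the bins are contiguous, the same problem can also be solved exactly with a small dynamic program. This is a different technique from the MILP above, not a translation of it. Below is a minimal Python sketch under stated assumptions: the function name `best_bins`, its parameters, and the toy data are hypothetical; `counts` and `ones` are meant to play the role of nv_ind_all and nv_ind_one; and `min_points=2` mirrors the fact that the OPTMODEL code only builds arcs with i < j.

```python
from math import inf

def best_bins(counts, ones, num_bins, size_lb, size_ub, min_points=2):
    """Exact contiguous binning by dynamic programming (illustrative sketch).

    counts[k], ones[k]: observations and ones at the k-th distinct value.
    Returns (total_cost, bins) with bins as 1-based inclusive (i, j) pairs,
    or None if no feasible partition exists.
    """
    n = len(counts)
    cum_all, cum_one = [0], [0]
    for c, o in zip(counts, ones):
        cum_all.append(cum_all[-1] + c)
        cum_one.append(cum_one[-1] + o)

    def cost(i, j):
        # Bin covering 0-based points i..j-1; inf if the bin is not allowed.
        if j - i < min_points:
            return inf
        size = cum_all[j] - cum_all[i]
        if size <= 0 or not (size_lb <= size <= size_ub):
            return inf
        k = cum_one[j] - cum_one[i]
        # Corrected SS of a 0/1 outcome: k * (1 - k/size) = (size - k) * k/size
        return k * (size - k) / size

    # best[b][j]: minimum cost of covering the first j points with exactly b bins
    best = [[inf] * (n + 1) for _ in range(num_bins + 1)]
    prev = [[-1] * (n + 1) for _ in range(num_bins + 1)]
    best[0][0] = 0.0
    for b in range(1, num_bins + 1):
        for j in range(1, n + 1):
            for i in range(j):
                if best[b - 1][i] == inf:
                    continue
                total = best[b - 1][i] + cost(i, j)
                if total < best[b][j]:
                    best[b][j] = total
                    prev[b][j] = i

    if best[num_bins][n] == inf:
        return None

    # Walk the back-pointers to recover the bins.
    bins, j = [], n
    for b in range(num_bins, 0, -1):
        i = prev[b][j]
        bins.append((i + 1, j))  # 1-based inclusive, like Pairs[i,j]
        j = i
    bins.reverse()
    return best[num_bins][n], bins

# Toy data (made up for illustration, not the 83-point data set).
toy_counts = [5, 5, 5, 5, 5, 5]
toy_ones = [0, 1, 0, 5, 4, 5]
print(best_bins(toy_counts, toy_ones, num_bins=2, size_lb=1, size_ub=100))
```

On the real data, the call would presumably use the 83 per-point counts with num_bins=7, size_lb=1000, and size_ub=10000. Since the _OBJCOEF_ values shown above are rounded to two decimals, any comparison with 5079.62 or 5079.63 should allow for that rounding.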