Hello,
I noticed something weird: let say I run a proc optmodel for group 1 to 100, and that I obtain OPTIMAL solutions for all, except BEST_FEASIBLE for groups 5 and 25.
Now if I launch again the exact same proc optmodel with only the groups 5 and 25, I obtain OPTIMAL results for them!
Sometimes it's even more strange since I will obtain for example an OPTIMAL result for group 5, only if I run it after let say group 2...
So to obtain optimal results for every group, I need to relaunch the optimization process for a subset of groups, sometimes several times, with trial and error.
Why the result of a by-group is sometimes dependent on the result of the previous by-group? What options could solve this?
Short answer: you can get the optimal solution by adding soltype=0. By default it is 1. Bestfeasible status only exists if an optimal solution has been found but a point was found earlier that is also feasible for the provided tolerance but its dual-optimality measure was not yet within tolerance.
Long answer why:
Bestfeasible versus optimal is a hard one to describe but was highly requested by users as the default. Basically this option lives in the gap between analytic and numerical optimization.
Suppose we minimize f(x) subject to c(x) = 0.
Let g(x) be the gradient of f(x) and J(x) the Jacobian of c(x).
The algorithm iteratively creates a list of points {(x1,y1), (x2, y2) , ... (xk, yk)}. Loosely speaking we stop adding points to this list when
(at least to user provided tolerances) and declare optimal. Together these conditions are called the first-order optimality conditions. In exact arithmetic on convex problems we should have that
f(xk) < min_{1<= j < k} {f(xj) : c(xj) = 0}
for all feasible points encountered in this list. I.e. the last point has the best objective. However, in numerical optimization it is not uncommon to find a point f(xj) with j < k where
I.e. we could have stopped at xj if a yj was known that satisfied the dual optimality condition (2) above as well as condition (1).
So which would the user want?
(a) A feasible point that has a better objective but non-optimal dual multipliers
(b) A fully optimal point whose objective is worse.
I understand that in analytic convex settings this makes no sense and seems like it shouldn't happen. But in the numerical nonlinear world, it is pretty easy to create small examples where this happens with arbitrarily large gaps between f(xk) and f(xj). And it is one of those cases where any fix has the property that the cure will be worse than the illness. I.e. for a given problem you can fix it at the cost of making other problems that work wonderfully, break. Not sure if that makes sense? It is a bit hard to describe without going into pdf/latex.
You don't include your program, but something like this might occur if you are letting the program choose initial guesses randomly or if there is a random component to the solver. The random number stream is presumably reused between BY groups, which means that the set of random numbers used to solve the problem depends on the number of BY groups, the number of iterations within each BY-group, etc.
This is just a guess. Post your program if you want a more informed opinion.
Thank you very much for your explanations! Here is a small working example:
data MVC;
input grp _1 _2 _3 _4 _5 _6 name $;
datalines;
2 0.001426101 0.001373928 0.001499921 0.000853808 0.000732604 0.001131657 _1
2 0.001373928 0.001491575 0.001604228 0.00094015 0.000836575 0.001257868 _2
2 0.001499921 0.001604228 0.001800986 0.000989169 0.000893931 0.001403105 _3
2 0.000853808 0.00094015 0.000989169 0.00082856 0.000629721 0.000877513 _4
2 0.000732604 0.000836575 0.000893931 0.000629721 0.000649809 0.00078795 _5
2 0.001131657 0.001257868 0.001403105 0.000877513 0.00078795 0.001276228 _6
3 0.001408027 0.001347115 0.001451531 0.000833388 0.000725207 0.001087888 _1
3 0.001347115 0.001457885 0.001550696 0.000912191 0.000823088 0.001209183 _2
3 0.001451531 0.001550696 0.001726184 0.000941497 0.000866587 0.00133465 _3
3 0.000833388 0.000912191 0.000941497 0.000806385 0.000620399 0.000834292 _4
3 0.000725207 0.000823088 0.000866587 0.000620399 0.00064782 0.000763326 _5
3 0.001087888 0.001209183 0.00133465 0.000834292 0.000763326 0.001213602 _6
4 0.001429224 0.001353777 0.001464237 0.000843381 0.000726821 0.001086287 _1
4 0.001353777 0.001456634 0.00155055 0.000915175 0.000820362 0.001200883 _2
4 0.001464237 0.00155055 0.001728677 0.000947293 0.000863554 0.00132404 _3
4 0.000843381 0.000915175 0.000947293 0.000811089 0.000621009 0.000833171 _4
4 0.000726821 0.000820362 0.000863554 0.000621009 0.000644818 0.000755667 _5
4 0.001086287 0.001200883 0.00132404 0.000833171 0.000755667 0.001195544 _6
;
data Predicted;
input grp _1 _2 _3 _4 _5 _6;
datalines;
2 0.04341 0.03845 0.05235 0.0088 0.02185 0.03547
3 0.04658 0.04025 0.04254 0.00115 0.01937 0.03196
4 0.02918 0.03295 0.03577 0.00836 0.0213 0.03342
;
data Targetvar;
input grp Target_var;
datalines;
2 0.000848383
3 0.000867194
4 0.000864978
;
%let byvar = grp;
proc optmodel printlevel=0 NOINITVAR;
set OBS;
set GROUPS;
set <str> ASSETS;
set <num,str> GROUPS_ASSETS;
num grp{OBS};
num MVC{GROUPS_ASSETS, ASSETS};
num E{GROUPS_ASSETS};
num Target_var{GROUPS};
read data MVC into ASSETS=[Name];
read data MVC into OBS=[_N_] grp;
read data MVC nomiss into GROUPS_ASSETS=[k=grp i=Name] {j in ASSETS} <MVC[k,i,j]=col(j)>;
read data Predicted nomiss into [k=grp] {j in ASSETS} <E[k,j]=col(j)>;
read data Targetvar into GROUPS=[grp] Target_var;
set BYSET = setof {i in OBS} &byvar.[i];
num by;
set OBS_BY = {i in OBS: &byvar.[i] = by};
set Assets_BY = setof {o in OBS_BY, <(grp[o]),a> in GROUPS_ASSETS} a;
var W{Assets} >= 0;
maximize Expected = sum{i in Assets_BY}W[i]*E[by,i];
con sum{i in Assets_BY, j in Assets_BY}W[i]*MVC[by,i,j]*W[j] = Target_var[by];
con BUDGET: sum{i in Assets_BY}W[i] = 1;
num W_sol {GROUPS_ASSETS};
str solstatus {BYSET};
do by = BYSET;
put by=;
solve with NLP / solver=ACTIVESET;
for {i in Assets_BY} W_sol[by,i] = W[i].sol;
solstatus[by] = _solution_status_;
end;
create data Weights from [&byvar i] W=W_sol;
create data Solstatusdata from [&byvar] solstatus;
quit;
grp3 is OPTIMAL with NOINITVAR and BEST_FEASIBLE without this option, but for grp4 that's the contrary: OPTIMAL when uses previous values and BEST_FEASIBLE when independent from previous optimization.
So I can launch it twice, once with NOINITVAR and once without, and keep the best solution but that's far from being optimal since I have several hundreds of hours of optimization process to run. I also run it with multiple solvers, and different options for FEASTOL, which leads to thousands of hours of optimization... (computer working 24/7 on this)
Do you see a less time-consuming solution?
I have four suggestions.
1. The variable W should be indexed by Assets_BY:
/* var W{Assets} >= 0;*/
var W{Assets_BY} >= 0;
Or if Assets_BY will always be the same as Assets, just use Assets everywhere.
2. Use the MULTISTART option in the SOLVE WITH NLP statement. On my machine, this yielded OPTIMAL for all three groups.
3. Explicitly initialize the variables inside the loop, just before the solve. For example, the following statements distribute the budget uniformly for the initial solution:
do by = BYSET;
put by=;
for {i in Assets_BY} W[i] = 1/card(Assets_BY);
4. You might see better performance if you relax the nonlinear constraint from equality to inequality:
/* con sum{i in Assets_BY, j in Assets_BY}W[i]*MVC[by,i,j]*W[j] = Target_var[by];*/
con sum{i in Assets_BY, j in Assets_BY}W[i]*MVC[by,i,j]*W[j] <= Target_var[by];
Thank you very much for these suggestions! When I use the MULTISTART option, it often fails, and with this example it failed for the 3 groups for me.
For grp = 2:
WARNING: A default random number seed was generated by Multistart. To reproduce the solution of the current
run, specify SEED=9136841 in a subsequent SOLVE statement.
%let byvar = grp;
proc optmodel printlevel=0;
set OBS;
set GROUPS;
set <str> ASSETS;
set <num,str> GROUPS_ASSETS;
num grp{OBS};
num MVC{GROUPS_ASSETS, ASSETS};
num E{GROUPS_ASSETS};
num Target_var{GROUPS};
read data MVC into ASSETS=[Name];
read data MVC into OBS=[_N_] grp;
read data MVC nomiss into GROUPS_ASSETS=[k=grp i=Name] {j in ASSETS} <MVC[k,i,j]=col(j)>;
read data Predicted nomiss into [k=grp] {j in ASSETS} <E[k,j]=col(j)>;
read data Targetvar into GROUPS=[grp] Target_var;
set BYSET = setof {i in OBS} &byvar.[i];
num by;
set OBS_BY = {i in OBS: &byvar.[i] = by};
set Assets_BY = setof {o in OBS_BY, <(grp[o]),a> in GROUPS_ASSETS} a;
var W{Assets_BY} >= 0;
max Expected = sum{i in Assets_BY}W[i]*E[by,i];
con sum{i in Assets_BY, j in Assets_BY}W[i]*MVC[by,i,j]*W[j] <= Target_var[by];
con BUDGET: sum{i in Assets_BY}W[i] = 1;
num W_sol {GROUPS_ASSETS};
str solstatus {BYSET};
do by = BYSET;
put by=;
for {i in Assets_BY} W[i] = 1/card(Assets_BY);
solve with NLP / solver=ACTIVESET MULTISTART;
for {i in Assets_BY} W_sol[by,i] = W[i].sol;
solstatus[by] = _solution_status_;
end;
create data Weights from [&byvar i] W=W_sol;
create data Solstatusdata from [&byvar] solstatus;
quit;
For this example, I obtain OPTIMAL for each group when I use FEASTOL=1E-7, but with the full sample I need to use different values for FEASTOL, sometimes MULTISTART=(maxstarts=2) and different solvers (INTERIORPOINT, ACTIVESET and even "NLPC / tech=quanew") to obtain OPTIMAL for each group.
What version of SAS/OR are you running?
9.4TS1M5 with SAS/OR 14.3
Short answer: you can get the optimal solution by adding soltype=0. By default it is 1. Bestfeasible status only exists if an optimal solution has been found but a point was found earlier that is also feasible for the provided tolerance but its dual-optimality measure was not yet within tolerance.
Long answer why:
Bestfeasible versus optimal is a hard one to describe but was highly requested by users as the default. Basically this option lives in the gap between analytic and numerical optimization.
Suppose we minimize f(x) subject to c(x) = 0.
Let g(x) be the gradient of f(x) and J(x) the Jacobian of c(x).
The algorithm iteratively creates a list of points {(x1,y1), (x2, y2) , ... (xk, yk)}. Loosely speaking we stop adding points to this list when
(at least to user provided tolerances) and declare optimal. Together these conditions are called the first-order optimality conditions. In exact arithmetic on convex problems we should have that
f(xk) < min_{1<= j < k} {f(xj) : c(xj) = 0}
for all feasible points encountered in this list. I.e. the last point has the best objective. However, in numerical optimization it is not uncommon to find a point f(xj) with j < k where
I.e. we could have stopped at xj if a yj was known that satisfied the dual optimality condition (2) above as well as condition (1).
So which would the user want?
(a) A feasible point that has a better objective but non-optimal dual multipliers
(b) A fully optimal point whose objective is worse.
I understand that in analytic convex settings this makes no sense and seems like it shouldn't happen. But in the numerical nonlinear world, it is pretty easy to create small examples where this happens with arbitrarily large gaps between f(xk) and f(xj). And it is one of those cases where any fix has the property that the cure will be worse than the illness. I.e. for a given problem you can fix it at the cost of making other problems that work wonderfully, break. Not sure if that makes sense? It is a bit hard to describe without going into pdf/latex.
Thank you very much for this explanation, I think I got the idea and it is a lot more clearer now.
Could you please tell me how to save in a dataset the optimal objective and the objective of the best feasible solution found?
(like the solution status is saved with "str solstatus {BYSET};", "solstatus[by] = _solution_status_" and "create data Solstatusdata from [&byvar] solstatus;")
It now seems to me more relevant to use these two informations to select the best solution, rather than the solution status.
Hmm ... that is a great idea. At the time the option was supported we only had the ability to return a single solution. We can now return multiple solutions I believe, but we would need to add support in C to do this. We will for sure consider that option for a future release.
Unfortunately there is no way to return both. One option is conditionally call solve a second time (not wasted work--see below):
num W_sol {GROUPS_ASSETS}; num W_sol2 {GROUPS_ASSETS}; str solstatus {BYSET}; str solstatus2 {BYSET}; do by = BYSET; put by=; solve with NLP / solver=ACTIVESET; for {i in Assets_BY} W_sol[by,i] = W[i].sol; solstatus[by] = _solution_status_; if _solution_status_ = 'BEST_FEASIBLE' then do; put "Solving a second time"; solve with NLP / solver=ACTIVESET soltype=0; for {i in Assets_BY} W_sol2[by,i] = W[i].sol; solstatus2[by] = _solution_status_; end; end;
It is not necessarily extra unused work if you remove this option: NOINITVAR
In that case you would be refining the best feasible point in an attempt to make it optimal as well. But by setting soltype=0 on the second solve you ensure you do not need a loop. The second solve will only occur if the first is best feasible and will always return the optimal point found.
I will let the team know your idea about returning both solutions at the same time--thanks!
Thank you very much, I like this solution! Looking forward testing it
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.
Find more tutorials on the SAS Users YouTube channel.