BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Alain38
Quartz | Level 8

Hello,

 

I noticed something weird: let say I run a proc optmodel for group 1 to 100, and that I obtain OPTIMAL solutions for all, except BEST_FEASIBLE for groups 5 and 25.

 

Now if I launch again the exact same proc optmodel with only the groups 5 and 25, I obtain OPTIMAL results for them!

 

Sometimes it's even more strange since I will obtain for example an OPTIMAL result for group 5, only if I run it after let say group 2...

 

So to obtain optimal results for every group, I need to relaunch the optimization process for a subset of groups, sometimes several times, with trial and error.

 

Why the result of a by-group is sometimes dependent on the result of the previous by-group? What options could solve this?

1 ACCEPTED SOLUTION

Accepted Solutions
josgri
SAS Employee

Short answer: you can get the optimal solution by adding soltype=0. By default it is 1.  Bestfeasible status only exists if an optimal solution has been found but a point was found earlier that is also feasible for the provided tolerance but its dual-optimality measure was not yet within tolerance.

 

Long answer why:

Bestfeasible versus optimal is a hard one to describe but was highly requested by users as the default.  Basically this option lives in the gap between analytic and numerical optimization.   

 

Suppose we minimize f(x) subject to c(x) = 0.

 

Let g(x) be the gradient of f(x) and J(x) the Jacobian of c(x).

 

The algorithm iteratively creates a list of points {(x1,y1), (x2, y2) , ... (xk, yk)}.  Loosely speaking we stop adding points to this list when

 

  1. c(xk) = 0
  2. g(xk) - J(xk)'*yk = 0

 

(at least to user provided tolerances) and declare optimal.  Together these conditions are called the first-order optimality conditions.  In exact arithmetic on convex problems we should have that 

 

f(xk) < min_{1<= j < k} {f(xj) : c(xj) = 0}

 

for all feasible points encountered in this list.  I.e. the last point has the best objective.  However, in numerical optimization it is not uncommon to find a point f(xj) with j < k where 

 

  • f(xj) < f(xk)
  • c(xj) = 0
  • but g(xj) - J(xj)'*yj is not yet 0

I.e. we could have stopped at xj if a yj was known that satisfied the dual optimality condition (2) above as well as condition (1). 

 

So which would the user want?

 

(a) A feasible point that has a better objective but non-optimal dual multipliers

(b) A fully optimal point whose objective is worse.

 

I understand that in analytic convex settings this makes no sense and seems like it shouldn't happen.  But in the numerical nonlinear world, it is pretty easy to create small examples where this happens with arbitrarily large gaps between f(xk) and f(xj).  And it is one of those cases where any fix has the property that the cure will be worse than the illness.  I.e. for a given problem you can fix it at the cost of making other problems that work wonderfully, break.  Not sure if that makes sense? It is a bit hard to describe without going into pdf/latex.

View solution in original post

12 REPLIES 12
Rick_SAS
SAS Super FREQ

You don't include your program, but something like this might occur if you are letting the program choose initial guesses randomly or if there is a random component to the solver. The random number stream is presumably reused between BY groups, which means that the set of random numbers used to solve the problem depends on the number of BY groups, the number of iterations within each BY-group, etc. 

 

This is just a guess. Post your program if you want a more informed opinion.

RobPratt
SAS Super FREQ
By default, each NLP solver call uses the current variable values as a starting solution. In your case, these values come from the previous solve. To avoid this behavior, use the NOINITVAR option in the PROC OPTMODEL statement or explicitly set the values before each solve.
Alain38
Quartz | Level 8

Thank you very much for your explanations! Here is a small working example:

 

 

data MVC;
      input  grp _1 _2 _3 _4 _5 _6 name $; 
   datalines;
2	0.001426101	0.001373928	0.001499921	0.000853808	0.000732604	0.001131657	_1
2	0.001373928	0.001491575	0.001604228	0.00094015	0.000836575	0.001257868	_2
2	0.001499921	0.001604228	0.001800986	0.000989169	0.000893931	0.001403105	_3
2	0.000853808	0.00094015	0.000989169	0.00082856	0.000629721	0.000877513	_4
2	0.000732604	0.000836575	0.000893931	0.000629721	0.000649809	0.00078795	_5
2	0.001131657	0.001257868	0.001403105	0.000877513	0.00078795	0.001276228	_6
3	0.001408027	0.001347115	0.001451531	0.000833388	0.000725207	0.001087888	_1
3	0.001347115	0.001457885	0.001550696	0.000912191	0.000823088	0.001209183	_2
3	0.001451531	0.001550696	0.001726184	0.000941497	0.000866587	0.00133465	_3
3	0.000833388	0.000912191	0.000941497	0.000806385	0.000620399	0.000834292	_4
3	0.000725207	0.000823088	0.000866587	0.000620399	0.00064782	0.000763326	_5
3	0.001087888	0.001209183	0.00133465	0.000834292	0.000763326	0.001213602	_6
4	0.001429224	0.001353777	0.001464237	0.000843381	0.000726821	0.001086287	_1
4	0.001353777	0.001456634	0.00155055	0.000915175	0.000820362	0.001200883	_2
4	0.001464237	0.00155055	0.001728677	0.000947293	0.000863554	0.00132404	_3
4	0.000843381	0.000915175	0.000947293	0.000811089	0.000621009	0.000833171	_4
4	0.000726821	0.000820362	0.000863554	0.000621009	0.000644818	0.000755667	_5
4	0.001086287	0.001200883	0.00132404	0.000833171	0.000755667	0.001195544	_6
;

data Predicted;
      input  grp _1 _2 _3 _4 _5 _6; 
   datalines;
2	0.04341	0.03845	0.05235	0.0088	0.02185	0.03547
3	0.04658	0.04025	0.04254	0.00115	0.01937	0.03196
4	0.02918	0.03295	0.03577	0.00836	0.0213	0.03342
;

data Targetvar;
      input grp Target_var; 
   datalines;
2	0.000848383
3	0.000867194
4	0.000864978
;

%let byvar = grp;
proc optmodel printlevel=0 NOINITVAR;
	set OBS;
	set GROUPS;
	set <str> ASSETS;			
	set <num,str> GROUPS_ASSETS;

	num grp{OBS};
	num MVC{GROUPS_ASSETS, ASSETS};					
	num E{GROUPS_ASSETS};
	num Target_var{GROUPS};

	read data MVC into ASSETS=[Name];
	read data MVC into OBS=[_N_] grp;
	read data MVC nomiss into GROUPS_ASSETS=[k=grp i=Name] {j in ASSETS} <MVC[k,i,j]=col(j)>;
	read data Predicted nomiss into [k=grp] {j in ASSETS} <E[k,j]=col(j)>;
	read data Targetvar into GROUPS=[grp] Target_var;

	set BYSET = setof {i in OBS} &byvar.[i];
	num by;

	set OBS_BY = {i in OBS: &byvar.[i] = by};
	set Assets_BY = setof {o in OBS_BY, <(grp[o]),a> in GROUPS_ASSETS} a;
	var W{Assets} >= 0;

	maximize Expected = sum{i in Assets_BY}W[i]*E[by,i];
	con sum{i in Assets_BY, j in Assets_BY}W[i]*MVC[by,i,j]*W[j] = Target_var[by];
	con BUDGET: sum{i in Assets_BY}W[i] = 1;

	num W_sol {GROUPS_ASSETS};
	str solstatus {BYSET};
	do by = BYSET;
		put by=;
		solve with NLP / solver=ACTIVESET;
		for {i in Assets_BY} W_sol[by,i] = W[i].sol;
		solstatus[by] = _solution_status_;
	end;

	create data Weights from [&byvar i] W=W_sol;
	create data Solstatusdata from [&byvar] solstatus;
quit;

grp3 is OPTIMAL with NOINITVAR and BEST_FEASIBLE without this option, but for grp4 that's the contrary: OPTIMAL when uses previous values and BEST_FEASIBLE when independent from previous optimization.

 

 

So I can launch it twice, once with NOINITVAR and once without, and keep the best solution but that's far from being optimal since I have several hundreds of hours of optimization process to run. I also run it with multiple solvers, and different options for FEASTOL, which leads to thousands of hours of optimization... (computer working 24/7 on this)

 

Do you see a less time-consuming solution?

RobPratt
SAS Super FREQ

I have four suggestions.

 

1. The variable W should be indexed by Assets_BY:

/*	var W{Assets} >= 0;*/
	var W{Assets_BY} >= 0;

Or if Assets_BY will always be the same as Assets, just use Assets everywhere.

 

2. Use the MULTISTART option in the SOLVE WITH NLP statement.  On my machine, this yielded OPTIMAL for all three groups.

 

3. Explicitly initialize the variables inside the loop, just before the solve.  For example, the following statements distribute the budget uniformly for the initial solution:

do by = BYSET;
   put by=;
   for {i in Assets_BY} W[i] = 1/card(Assets_BY);

4. You might see better performance if you relax the nonlinear constraint from equality to inequality:

/*	con sum{i in Assets_BY, j in Assets_BY}W[i]*MVC[by,i,j]*W[j] = Target_var[by];*/
	con sum{i in Assets_BY, j in Assets_BY}W[i]*MVC[by,i,j]*W[j] <= Target_var[by];
Alain38
Quartz | Level 8

Thank you very much for these suggestions! When I use the MULTISTART option, it often fails, and with this example it failed for the 3 groups for me.

 

For grp = 2:

WARNING: A default random number seed was generated by Multistart. To reproduce the solution of the current
         run, specify SEED=9136841 in a subsequent SOLVE statement.

 

%let byvar = grp;
proc optmodel printlevel=0;
	set OBS;
	set GROUPS;
	set <str> ASSETS;			
	set <num,str> GROUPS_ASSETS;

	num grp{OBS};
	num MVC{GROUPS_ASSETS, ASSETS};					
	num E{GROUPS_ASSETS};
	num Target_var{GROUPS};

	read data MVC into ASSETS=[Name];
	read data MVC into OBS=[_N_] grp;
	read data MVC nomiss into GROUPS_ASSETS=[k=grp i=Name] {j in ASSETS} <MVC[k,i,j]=col(j)>;
	read data Predicted nomiss into [k=grp] {j in ASSETS} <E[k,j]=col(j)>;
	read data Targetvar into GROUPS=[grp] Target_var;

	set BYSET = setof {i in OBS} &byvar.[i];
	num by;

	set OBS_BY = {i in OBS: &byvar.[i] = by};
	set Assets_BY = setof {o in OBS_BY, <(grp[o]),a> in GROUPS_ASSETS} a;
	var W{Assets_BY} >= 0;

	max Expected = sum{i in Assets_BY}W[i]*E[by,i];
	con sum{i in Assets_BY, j in Assets_BY}W[i]*MVC[by,i,j]*W[j] <= Target_var[by];
	con BUDGET: sum{i in Assets_BY}W[i] = 1;

	num W_sol {GROUPS_ASSETS};
	str solstatus {BYSET};
	do by = BYSET;
		put by=;
		for {i in Assets_BY} W[i] = 1/card(Assets_BY);
		solve with NLP / solver=ACTIVESET MULTISTART;
		for {i in Assets_BY} W_sol[by,i] = W[i].sol;
		solstatus[by] = _solution_status_;
	end;

	create data Weights from [&byvar i] W=W_sol;
	create data Solstatusdata from [&byvar] solstatus;
quit;

For this example, I obtain OPTIMAL for each group when I use FEASTOL=1E-7, but with the full sample I need to use different values for FEASTOL, sometimes MULTISTART=(maxstarts=2) and different solvers (INTERIORPOINT, ACTIVESET and even "NLPC / tech=quanew") to obtain OPTIMAL for each group.

RobPratt
SAS Super FREQ

What version of SAS/OR are you running?

Alain38
Quartz | Level 8

9.4TS1M5 with SAS/OR 14.3

josgri
SAS Employee

Short answer: you can get the optimal solution by adding soltype=0. By default it is 1.  Bestfeasible status only exists if an optimal solution has been found but a point was found earlier that is also feasible for the provided tolerance but its dual-optimality measure was not yet within tolerance.

 

Long answer why:

Bestfeasible versus optimal is a hard one to describe but was highly requested by users as the default.  Basically this option lives in the gap between analytic and numerical optimization.   

 

Suppose we minimize f(x) subject to c(x) = 0.

 

Let g(x) be the gradient of f(x) and J(x) the Jacobian of c(x).

 

The algorithm iteratively creates a list of points {(x1,y1), (x2, y2) , ... (xk, yk)}.  Loosely speaking we stop adding points to this list when

 

  1. c(xk) = 0
  2. g(xk) - J(xk)'*yk = 0

 

(at least to user provided tolerances) and declare optimal.  Together these conditions are called the first-order optimality conditions.  In exact arithmetic on convex problems we should have that 

 

f(xk) < min_{1<= j < k} {f(xj) : c(xj) = 0}

 

for all feasible points encountered in this list.  I.e. the last point has the best objective.  However, in numerical optimization it is not uncommon to find a point f(xj) with j < k where 

 

  • f(xj) < f(xk)
  • c(xj) = 0
  • but g(xj) - J(xj)'*yj is not yet 0

I.e. we could have stopped at xj if a yj was known that satisfied the dual optimality condition (2) above as well as condition (1). 

 

So which would the user want?

 

(a) A feasible point that has a better objective but non-optimal dual multipliers

(b) A fully optimal point whose objective is worse.

 

I understand that in analytic convex settings this makes no sense and seems like it shouldn't happen.  But in the numerical nonlinear world, it is pretty easy to create small examples where this happens with arbitrarily large gaps between f(xk) and f(xj).  And it is one of those cases where any fix has the property that the cure will be worse than the illness.  I.e. for a given problem you can fix it at the cost of making other problems that work wonderfully, break.  Not sure if that makes sense? It is a bit hard to describe without going into pdf/latex.

Alain38
Quartz | Level 8

Thank you very much for this explanation, I think I got the idea and it is a lot more clearer now.

Alain38
Quartz | Level 8

Could you please tell me how to save in a dataset the optimal objective and the objective of the best feasible solution found?

(like the solution status is saved with "str solstatus {BYSET};", "solstatus[by] = _solution_status_" and "create data Solstatusdata from [&byvar] solstatus;")

 

It now seems to me more relevant to use these two informations to select the best solution, rather than the solution status.

josgri
SAS Employee

Hmm ... that is a great idea.  At the time the option was supported we only had the ability to return a single solution.  We can now return multiple solutions I believe, but we would need to add support in C  to do this.  We will for sure consider that option for a future release.  

 

Unfortunately there is no way to return both.  One option is conditionally call solve a second time (not wasted work--see below):

 

num W_sol {GROUPS_ASSETS};
num W_sol2 {GROUPS_ASSETS};
str solstatus {BYSET};
str solstatus2 {BYSET};
do by = BYSET;
    put by=;
    solve with NLP / solver=ACTIVESET;
    for {i in Assets_BY} W_sol[by,i] = W[i].sol;
    solstatus[by] = _solution_status_; 
    if _solution_status_ = 'BEST_FEASIBLE' then do;
         put "Solving a second time";
         solve with NLP / solver=ACTIVESET soltype=0;
         for {i in Assets_BY} W_sol2[by,i] = W[i].sol;
         solstatus2[by] = _solution_status_;
     end;		
end;

It is not necessarily extra unused work if you remove this option:  NOINITVAR

 

In that case you would be refining the best feasible point in an attempt to make it optimal as well.  But by setting soltype=0 on the second solve you ensure you do not need a loop.  The second solve will only occur if the first is best feasible and will always return the optimal point found.

 

I will let the team know your idea about returning both solutions at the same time--thanks!

Alain38
Quartz | Level 8

Thank you very much, I like this solution! Looking forward testing it Smiley Happy

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

Multiple Linear Regression in SAS

Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 12 replies
  • 1252 views
  • 9 likes
  • 4 in conversation