BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
harmonic
Quartz | Level 8

Hello community,

 

I am starting using Proc Opt model, and I would like to minimize a function like this one:

min|y|=B1*type + B2*b + B3*c +  B4*d + B5*e + B6

 

Where values are taken from a dataset, here the values:

Y should be an integer number and this is the distribution

harmonic_0-1739351362386.png


B1, B2, B3, B4, B5 and B6 are constants

'type' should be a string with values "white", "black"

Should I have different results for each string?

'b' should be with values 2 and 12

 

'c' should be in a float value in this distribution 

harmonic_1-1739351596501.png

 

'd' should be in an interval 1 to 48 integer values

 

'e' should be with values 0, 2 or 12

Thank you in advance
@RobPratt @Ksharp 

1 ACCEPTED SOLUTION

Accepted Solutions
harmonic
Quartz | Level 8

Yes but I would like to use the read data after I imported my xlsx, is that possible?

 

View solution in original post

19 REPLIES 19
RobPratt
SAS Super FREQ

I don't quite understand your problem.  Do you want to solve a regression problem where b, c, d, and e are regression coefficients?  If B1 is a constant and type is a string, what does B1*type mean?  Are you able to share example data and also an example (not necessarily optimal) solution?

harmonic
Quartz | Level 8

Yes,

For example, I would like to minimize y in this dataset. I was planning to set a lower limit of y to 1 because I don’t want it to be zero, and I also need to include constraints that are not explicitly visible in my data. For instance, the values in column 'a' should be within the integer range from 6 to 48.

Could you provide an example of how to minimize a regression function for y using this dataset or by applying the constraints I want, instead of using one of my columns?

Type is an example of a string column where I should perform minimization for each string value, but for now, we can ignore it to simplify the problem.

 

RobPratt
SAS Super FREQ

The following PROC OPTMODEL code solves an ordinary least squares regression:

 

proc optmodel;
   /* declare sets and input parameters */
   set OBS;
   num y {OBS};
   num a {OBS};
   num b {OBS};
   num c {OBS};
   num d {OBS};
   num e {OBS};

   /* read input data set */
   read data input into OBS=[_N_] y a b c d e;

   /* declare decision variables */
   var Beta {0..5};
   var Error {OBS};
   impvar Estimate {i in OBS} = Beta[0] + Beta[1]*a[i] + Beta[2]*b[i] + Beta[3]*c[i] + Beta[4]*d[i] + Beta[5]*e[i];

   /* declare objective to minimize sum of squared errors */
   min SSE = sum {i in OBS} Error[i]^2;

   /* declare constraints */
   con ErrorCon {i in OBS}:
      Error[i] = Estimate[i] - y[i];

   /* call quadratic programming (QP) solver */
   solve;

   /* print regression coefficients */
   print Beta;

   /* create output data sets */
   create data BetaData from [j] Beta;
   create data EstimateData from [i] a b c d e y Estimate Error;
quit;

The resulting regression coefficients are:

 

[1] Beta
0 416.9191
1 19.3854
2 18.5124
3 -13.0827
4 -22.7077
5 -1.4499
 
Those are the same values that are returned by the following PROC REG call:
proc reg data=input;
   model y = a b c d e;
quit;
For the additional constraints, do you want to impose bounds on Beta[j]?
Or do you instead want to solve an optimization problem where a, b, c, d, and e are the (bounded) decision variables and you want to minimize the predicted value of y?  That is, with the fixed values of Beta[j] found earlier, do you want to minimize the linear function Beta[0] + Beta[1]*a + Beta[2]*b + Beta[3]*c + Beta[4]*d + Beta[5]*e subject to some constraints on a, b, c, d, and e?
harmonic
Quartz | Level 8

I don't want to find the betas, I already have the beta values. I would like to find the a, b, c , d , e values in an interval from my dataset or an interval that I will determine to minimize the y value. It's just an equation where my betas can be positive or negatives.

Ksharp
Super User

If I understood what you mean.

 

proc optmodel;
num a{1..2}=[2 12];
num b{1..3}=[2 12 24];
num d{i in 1..10}=i;
num e{i in 1..47}=i;
var var_a{1..2} binary;
var var_b{1..3} binary;
var var_c >=5 <=15;
var var_d{1..10} binary;
var var_e{1..47} binary;
impvar sum=-8*sum{i in 1..2} a[i]*var_a[i] +
           -2*sum{i in 1..3} b[i]*var_b[i] +
	       -2*var_c +
	       -4*sum{i in 1..10}d[i]*var_d[i] +
            2*sum{i in 1..47}e[i]*var_e[i] ;
min obj=sum;
con con_total:sum>=1;
con con_a:sum{i in 1..2} var_a[i]=1;
con con_b:sum{i in 1..3} var_b[i]=1;
con con_d:sum{i in 1..10}var_d[i]=1;
con con_e:sum{i in 1..47}var_e[i]=1;

solve ;

print a var_a ; 
print b var_b ;
print   var_c ;
print d var_d ;
print e var_e ;
quit;

 

Ksharp_0-1739438064032.png

Ksharp_1-1739438098006.png

Ksharp_2-1739438135663.png

 

 

RobPratt
SAS Super FREQ

Thanks for posting this reply.  +1

 

A slight improvement is that for variables d and e with consecutive integer domains, you can avoid num d, num e, con_d, and con_e by using integer variables:

   var var_d >= 1 <= 10 integer;
   var var_e >= 1 <= 47 integer;
harmonic
Quartz | Level 8

Thank you, I will try by today and tomorrow if it is work. If I would like to use steps of 4 within the interval? How can i change it?

 

RobPratt
SAS Super FREQ

Here's one way to enforce that x is in {2,6,10,...,42}:

   var x;
   set MYSET = 2..42 by 4;
   var y {MYSET} binary;
   con Mycon1: sum {i in MYSET} y[i] = 1;
   con Mycon2: sum {i in MYSET} i*y[i] = x;

Here's another way that uses fewer variables and constraints:

   var x >= 2 <= 42;
   var y integer;
   con Mycon: x = 2 + 4*y;

See also https://or.stackexchange.com/questions/6545/how-to-linearize-membership-in-a-finite-set

harmonic
Quartz | Level 8

How can I integrate that in the code of the proc optmodel above? @RobPratt 

RobPratt
SAS Super FREQ
Sorry, this thread is getting hard to follow. What do you want to integrate into which PROC OPTMODEL code?
harmonic
Quartz | Level 8

How can I insert these con

   var x >= 2 <= 42;
   var y integer;
   con Mycon: x = 2 + 4*y;


to this proc opt model

proc optmodel;
num a{1..2}=[2 12];
num b{1..3}=[2 12 24];
num d{i in 1..10}=i;
num e{i in 1..47}=i;
var var_a{1..2} binary;
var var_b{1..3} binary;
var var_c >=5 <=15;
var var_d{1..10} binary;
var var_e{1..47} binary;
impvar sum=-8*sum{i in 1..2} a[i]*var_a[i] +
           -2*sum{i in 1..3} b[i]*var_b[i] +
	       -2*var_c +
	       -4*sum{i in 1..10}d[i]*var_d[i] +
            2*sum{i in 1..47}e[i]*var_e[i] ;
min obj=sum;
con con_total:sum>=1;
con con_a:sum{i in 1..2} var_a[i]=1;
con con_b:sum{i in 1..3} var_b[i]=1;
con con_d:sum{i in 1..10}var_d[i]=1;
con con_e:sum{i in 1..47}var_e[i]=1;

solve ;

print a var_a ; 
print b var_b ;
print   var_c ;
print d var_d ;
print e var_e ;
quit;

@RobPratt 

Ksharp
Super User
proc optmodel;
num a{1..2}=[2 12];
num b{1..3}=[2 12 24];
num d{i in 1..10}=i;
num e{i in 1..47}=i;
var var_a{1..2} binary;
var var_b{1..3} binary;
var var_c >=2 <=42 ;
var var_d integer;
impvar sum= 4*sum{i in 1..2} a[i]*var_a[i] +
           -2*sum{i in 1..3} b[i]*var_b[i] +
	       -2*var_c +
	        4*var_d  ;
min obj=sum;
con con_total:sum>=1;
con con_a:sum{i in 1..2} var_a[i]=1;
con con_b:sum{i in 1..3} var_b[i]=1;
con con_d:var_c=2+4*var_d;

solve ;

print a var_a ; 
print b var_b ;
print   var_c ;
print   var_d ;
quit;

Ksharp_0-1739954638326.png

 

harmonic
Quartz | Level 8

If I would like to read the data from my dataset?

Ksharp
Super User

I don't understand. If you want to import these interval of variable from XLSX, you could make a macro variable (assuming the levels of these variable was not too many).

 

proc import datafile="C:\Users\xiakeshan\Downloads\input.xlsx" out=have dbms=xlsx replace;
run;
proc sql noprint;
select distinct a into :a separated by ' ' from have;
select count(distinct a) into :_a separated by ' ' from have;

select distinct b into :b separated by ' ' from have;
select count(distinct b) into :_b separated by ' ' from have;

select floor(min(c)) into :min_c from have;
select ceil(max(c)) into :max_c from have;

select distinct d into :d separated by ' ' from have;
select count(distinct d) into :_d separated by ' ' from have;

select distinct e into :e separated by ' ' from have;
select count(distinct e) into :_e separated by ' ' from have;

quit;

proc optmodel;
num a{1..&_a.}=[&a.];
num b{1..&_b.}=[&b.];
num d{1..&_d.}=[&d.];
num e{1..&_e.}=[&e.];
var var_a{1..&_a.} binary;
var var_b{1..&_b.} binary;
var var_c >=&min_c.  <=&max_c.;
var var_d{1..&_d.} binary;
var var_e{1..&_e.} binary;
impvar sum=-8*sum{i in 1..&_a.} a[i]*var_a[i] +
           -2*sum{i in 1..&_b.} b[i]*var_b[i] +
	       -2*var_c +
	       -4*sum{i in 1..&_d.}d[i]*var_d[i] +
            2*sum{i in 1..&_e.}e[i]*var_e[i] ;
min obj=sum;
con con_total:sum>=1;
con con_a:sum{i in 1..&_a.} var_a[i]=1;
con con_b:sum{i in 1..&_b.} var_b[i]=1;
con con_d:sum{i in 1..&_d.} var_d[i]=1;
con con_e:sum{i in 1..&_e.} var_e[i]=1;

solve ;

print a var_a ; 
print b var_b ;
print   var_c ;
print d var_d ;
print e var_e ;
quit;

hackathon24-white-horiz.png

The 2025 SAS Hackathon Kicks Off on June 11!

Watch the live Hackathon Kickoff to get all the essential information about the SAS Hackathon—including how to join, how to participate, and expert tips for success.

YouTube LinkedIn

Discussion stats
  • 19 replies
  • 4057 views
  • 12 likes
  • 3 in conversation