Hello community,
I am starting using Proc Opt model, and I would like to minimize a function like this one:
min|y|=B1*type + B2*b + B3*c + B4*d + B5*e + B6
Where values are taken from a dataset, here the values:
Y should be an integer number and this is the distribution
B1, B2, B3, B4, B5 and B6 are constants
'type' should be a string with values "white", "black"
Should I have different results for each string?
'b' should be with values 2 and 12
'c' should be in a float value in this distribution
'd' should be in an interval 1 to 48 integer values
'e' should be with values 0, 2 or 12
Thank you in advance
@RobPratt @Ksharp
Yes but I would like to use the read data after I imported my xlsx, is that possible?
I don't quite understand your problem. Do you want to solve a regression problem where b, c, d, and e are regression coefficients? If B1 is a constant and type is a string, what does B1*type mean? Are you able to share example data and also an example (not necessarily optimal) solution?
Yes,
For example, I would like to minimize y in this dataset. I was planning to set a lower limit of y to 1 because I don’t want it to be zero, and I also need to include constraints that are not explicitly visible in my data. For instance, the values in column 'a' should be within the integer range from 6 to 48.
Could you provide an example of how to minimize a regression function for y using this dataset or by applying the constraints I want, instead of using one of my columns?
Type is an example of a string column where I should perform minimization for each string value, but for now, we can ignore it to simplify the problem.
The following PROC OPTMODEL code solves an ordinary least squares regression:
proc optmodel;
/* declare sets and input parameters */
set OBS;
num y {OBS};
num a {OBS};
num b {OBS};
num c {OBS};
num d {OBS};
num e {OBS};
/* read input data set */
read data input into OBS=[_N_] y a b c d e;
/* declare decision variables */
var Beta {0..5};
var Error {OBS};
impvar Estimate {i in OBS} = Beta[0] + Beta[1]*a[i] + Beta[2]*b[i] + Beta[3]*c[i] + Beta[4]*d[i] + Beta[5]*e[i];
/* declare objective to minimize sum of squared errors */
min SSE = sum {i in OBS} Error[i]^2;
/* declare constraints */
con ErrorCon {i in OBS}:
Error[i] = Estimate[i] - y[i];
/* call quadratic programming (QP) solver */
solve;
/* print regression coefficients */
print Beta;
/* create output data sets */
create data BetaData from [j] Beta;
create data EstimateData from [i] a b c d e y Estimate Error;
quit;
The resulting regression coefficients are:
| [1] | Beta |
|---|---|
| 0 | 416.9191 |
| 1 | 19.3854 |
| 2 | 18.5124 |
| 3 | -13.0827 |
| 4 | -22.7077 |
| 5 | -1.4499 |
proc reg data=input;
model y = a b c d e;
quit;
I don't want to find the betas, I already have the beta values. I would like to find the a, b, c , d , e values in an interval from my dataset or an interval that I will determine to minimize the y value. It's just an equation where my betas can be positive or negatives.
If I understood what you mean.
proc optmodel;
num a{1..2}=[2 12];
num b{1..3}=[2 12 24];
num d{i in 1..10}=i;
num e{i in 1..47}=i;
var var_a{1..2} binary;
var var_b{1..3} binary;
var var_c >=5 <=15;
var var_d{1..10} binary;
var var_e{1..47} binary;
impvar sum=-8*sum{i in 1..2} a[i]*var_a[i] +
-2*sum{i in 1..3} b[i]*var_b[i] +
-2*var_c +
-4*sum{i in 1..10}d[i]*var_d[i] +
2*sum{i in 1..47}e[i]*var_e[i] ;
min obj=sum;
con con_total:sum>=1;
con con_a:sum{i in 1..2} var_a[i]=1;
con con_b:sum{i in 1..3} var_b[i]=1;
con con_d:sum{i in 1..10}var_d[i]=1;
con con_e:sum{i in 1..47}var_e[i]=1;
solve ;
print a var_a ;
print b var_b ;
print var_c ;
print d var_d ;
print e var_e ;
quit;
Thanks for posting this reply. +1
A slight improvement is that for variables d and e with consecutive integer domains, you can avoid num d, num e, con_d, and con_e by using integer variables:
var var_d >= 1 <= 10 integer;
var var_e >= 1 <= 47 integer;
Thank you, I will try by today and tomorrow if it is work. If I would like to use steps of 4 within the interval? How can i change it?
Here's one way to enforce that x is in {2,6,10,...,42}:
var x;
set MYSET = 2..42 by 4;
var y {MYSET} binary;
con Mycon1: sum {i in MYSET} y[i] = 1;
con Mycon2: sum {i in MYSET} i*y[i] = x;
Here's another way that uses fewer variables and constraints:
var x >= 2 <= 42;
var y integer;
con Mycon: x = 2 + 4*y;
See also https://or.stackexchange.com/questions/6545/how-to-linearize-membership-in-a-finite-set
How can I integrate that in the code of the proc optmodel above? @RobPratt
How can I insert these con
var x >= 2 <= 42;
var y integer;
con Mycon: x = 2 + 4*y;
to this proc opt model
proc optmodel;
num a{1..2}=[2 12];
num b{1..3}=[2 12 24];
num d{i in 1..10}=i;
num e{i in 1..47}=i;
var var_a{1..2} binary;
var var_b{1..3} binary;
var var_c >=5 <=15;
var var_d{1..10} binary;
var var_e{1..47} binary;
impvar sum=-8*sum{i in 1..2} a[i]*var_a[i] +
-2*sum{i in 1..3} b[i]*var_b[i] +
-2*var_c +
-4*sum{i in 1..10}d[i]*var_d[i] +
2*sum{i in 1..47}e[i]*var_e[i] ;
min obj=sum;
con con_total:sum>=1;
con con_a:sum{i in 1..2} var_a[i]=1;
con con_b:sum{i in 1..3} var_b[i]=1;
con con_d:sum{i in 1..10}var_d[i]=1;
con con_e:sum{i in 1..47}var_e[i]=1;
solve ;
print a var_a ;
print b var_b ;
print var_c ;
print d var_d ;
print e var_e ;
quit;
proc optmodel;
num a{1..2}=[2 12];
num b{1..3}=[2 12 24];
num d{i in 1..10}=i;
num e{i in 1..47}=i;
var var_a{1..2} binary;
var var_b{1..3} binary;
var var_c >=2 <=42 ;
var var_d integer;
impvar sum= 4*sum{i in 1..2} a[i]*var_a[i] +
-2*sum{i in 1..3} b[i]*var_b[i] +
-2*var_c +
4*var_d ;
min obj=sum;
con con_total:sum>=1;
con con_a:sum{i in 1..2} var_a[i]=1;
con con_b:sum{i in 1..3} var_b[i]=1;
con con_d:var_c=2+4*var_d;
solve ;
print a var_a ;
print b var_b ;
print var_c ;
print var_d ;
quit;
If I would like to read the data from my dataset?
I don't understand. If you want to import these interval of variable from XLSX, you could make a macro variable (assuming the levels of these variable was not too many).
proc import datafile="C:\Users\xiakeshan\Downloads\input.xlsx" out=have dbms=xlsx replace;
run;
proc sql noprint;
select distinct a into :a separated by ' ' from have;
select count(distinct a) into :_a separated by ' ' from have;
select distinct b into :b separated by ' ' from have;
select count(distinct b) into :_b separated by ' ' from have;
select floor(min(c)) into :min_c from have;
select ceil(max(c)) into :max_c from have;
select distinct d into :d separated by ' ' from have;
select count(distinct d) into :_d separated by ' ' from have;
select distinct e into :e separated by ' ' from have;
select count(distinct e) into :_e separated by ' ' from have;
quit;
proc optmodel;
num a{1..&_a.}=[&a.];
num b{1..&_b.}=[&b.];
num d{1..&_d.}=[&d.];
num e{1..&_e.}=[&e.];
var var_a{1..&_a.} binary;
var var_b{1..&_b.} binary;
var var_c >=&min_c. <=&max_c.;
var var_d{1..&_d.} binary;
var var_e{1..&_e.} binary;
impvar sum=-8*sum{i in 1..&_a.} a[i]*var_a[i] +
-2*sum{i in 1..&_b.} b[i]*var_b[i] +
-2*var_c +
-4*sum{i in 1..&_d.}d[i]*var_d[i] +
2*sum{i in 1..&_e.}e[i]*var_e[i] ;
min obj=sum;
con con_total:sum>=1;
con con_a:sum{i in 1..&_a.} var_a[i]=1;
con con_b:sum{i in 1..&_b.} var_b[i]=1;
con con_d:sum{i in 1..&_d.} var_d[i]=1;
con con_e:sum{i in 1..&_e.} var_e[i]=1;
solve ;
print a var_a ;
print b var_b ;
print var_c ;
print d var_d ;
print e var_e ;
quit;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and save with the early bird rate—just $795!