Solved: SAS/OR please help with a beginner's code

Germs · Posted 09-20-2018 08:39 AM

The data has three columns(450 raws): itemID,

Pvalue(the probability of each item happens),

rptbis (the point-biserial correlation).

e.g. variable names+ first three rows (full data set is in excel document in attachment):

ItemID pvalue rptbis
00001 0.295 0.437
00002 0.033 0.869
00003 0.988 0.964

The goal is to pick up 50 items from the pool to meet a few requirements using MIP

Objective function

Min y;

Subject to:

sum(x[i]*Pvalue[i])>32-y; /* sum of "pvalue" of the 50 items should be as close as 32*/

sum(x[i]*Pvalue[i])<32+y;

sum(x[i])=50; /*n=50 to pick up from the pool*/

x in (0, 1); /*x is a binary value in (0, 1) */

y>=0; /* I am not sure how to define thIS*/

13>=sum{ (pvalue[i]/（1-pvalue[i]))^(1/2)*rptbis[i])}>=12; /* the SD of the items picked should be between 12 and 13*/

I never get my code running, please help me with fixing the errors. Thank you!

proc optmodel;
set <str> pool;
num ItemID {pool};
num pvalue {pool};
num rptbis {pool};

read data itembank into pool=[ItemID]
pvalue rptbis;

var x{i in pool} binary; /*x in (0,1);*/

min y=sum{i in pool}(x[i]*pvalue[i])-32; /*issue #1, how can I define the objective value >0 ?*/

impvar sd=sum{i in pool}(pvalue[i]/（1-pvalue[i]）^(1/2)*rptbis[i]）;
con 13<=sd<=12; /*when I had this constraint, it said "infeasible", sad!*/

con sum{i in pool}(x[i]*pvalue[i])>32-y; /*when I had this 2 con, I wrote objective function as "min y", it said "no objective function"*/
con sum{i in pool}(x[i]*pvalue[i])<32+y;
con sum{i in pool}x[i]=50;
solve with MILP;
print x pvalue;

RobPratt · Posted 09-26-2018 01:53 PM

For question 1, I omitted sd. And for this instance, the DECOMP option speeds up the MILP solve.

proc optmodel;
   set <str> pool;
   num ItemID {pool};
   num pvalue {pool};
   num rptbis {pool};
   read data itembank into pool=[ItemID] pvalue rptbis;
   set groups = 1..5;

   var x{i in pool, j in groups} binary;
   var y{groups} >= 0;

   min z = sum{j in groups} y[j];
   
/*   impvar sd=sum{i in pool}(pvalue[i]/(1-pvalue[i])^(1/2)*rptbis[i])*x[i];*/
/*   con 12<=sd<=13;*/
   impvar psum{j in groups} = sum{i in pool} pvalue[i]*x[i,j];
   con TargetGE{j in groups}: psum[j] >= 32 - y[j];
   con TargetLE{j in groups}: psum[j] <= 32 + y[j];
   con CountPerGroup{j in groups}: sum{i in pool} x[i,j] = 50;
   con AssignAtMostOnce{i in pool}: sum{j in groups} x[i,j] <= 1;

   solve with milp / decomp;

/*   print x pvalue;*/
   print psum;

   create data soldata from [i]=pool {j in groups} <col('x'||j)=x[i,j]>;
quit;

For question 2, the answer is yes.

View solution in original post

RobPratt · Posted 09-20-2018 03:08 PM

Please share your itembank data.

RobPratt · Posted 09-25-2018 02:39 PM

I made several changes here. I'm not sure what you intended with sd, but your original IMPVAR had no variables on the right-hand side, so sd was a constant.

proc optmodel;
   set <str> pool;
   num ItemID {pool};
   num pvalue {pool};
   num rptbis {pool};
   read data itembank into pool=[ItemID]
      pvalue rptbis;

   var x{i in pool} binary; /*x in (0,1);*/

   var y >= 0;
   min z = y;*=sum{i in pool}(x[i]*pvalue[i])-32;  /*issue #1, how can I define the objective value >0 ?*/

/*   impvar sd=sum{i in pool}(pvalue[i]/(1-pvalue[i])^(1/2)*rptbis[i]);*/
   impvar sd=sum{i in pool}(pvalue[i]/(1-pvalue[i])^(1/2)*rptbis[i])*x[i];
*   con 13<=sd<=12;                                          /*when I had this constraint, it said "infeasible", sad!*/
   con 12<=sd<=13;

   con sum{i in pool}(x[i]*pvalue[i])>=32-y;      /*when I had this 2 con, I wrote objective function as "min y", it said "no objective function"*/
   con sum{i in pool}(x[i]*pvalue[i])<=32+y;
   con sum{i in pool}x[i]=50;
   solve;
   print x pvalue;
   print (sum{i in pool}(x[i]*pvalue[i]));
   print sd;
quit;

Germs · Posted 09-25-2018 10:46 PM

Thank you soooooo much!!!

After last post, I have figured out the mistakes I made, and corrected the computation of SD term. But what you wrote is much better than mine.

Here I have 2 extra questions:

Question 1. For a further modification of this program, e.g. to add a dimension, the output will be like this:

[1] x1 x2 x3 .... x5;
00001 0 0 1 .... 0
00002 0 1 0 .... 0
00003 1 0 0 .... 0

..................

00450 0 1 0 ...... 0;

which means I wish to select multiple groups x[i,j ] require:

for each group( in 1 to j) , choose 50 item from the pool:sum(x[ i ])=50;

each item can only goes into one group: sum(x)<=1, no overlap;

I created another dataset named "group" with only 1 variable (x1-x5), but I could not read x1-x5 into the dataset [pool] to make another dimension, I also don't know how to modify the program in "proc optmodel";

Question 2: can "proc optmodel" be used in SAS Macro? e.g.

%macro forms (group= );

data xxx;

set xxx;

.........;

run;

proc optmodel;

.......;

%mend;

%forms (group=5);

Thank you!!! Appreciate!!!

RobPratt · Posted 09-26-2018 01:53 PM

For question 1, I omitted sd. And for this instance, the DECOMP option speeds up the MILP solve.

proc optmodel;
   set <str> pool;
   num ItemID {pool};
   num pvalue {pool};
   num rptbis {pool};
   read data itembank into pool=[ItemID] pvalue rptbis;
   set groups = 1..5;

   var x{i in pool, j in groups} binary;
   var y{groups} >= 0;

   min z = sum{j in groups} y[j];
   
/*   impvar sd=sum{i in pool}(pvalue[i]/(1-pvalue[i])^(1/2)*rptbis[i])*x[i];*/
/*   con 12<=sd<=13;*/
   impvar psum{j in groups} = sum{i in pool} pvalue[i]*x[i,j];
   con TargetGE{j in groups}: psum[j] >= 32 - y[j];
   con TargetLE{j in groups}: psum[j] <= 32 + y[j];
   con CountPerGroup{j in groups}: sum{i in pool} x[i,j] = 50;
   con AssignAtMostOnce{i in pool}: sum{j in groups} x[i,j] <= 1;

   solve with milp / decomp;

/*   print x pvalue;*/
   print psum;

   create data soldata from [i]=pool {j in groups} <col('x'||j)=x[i,j]>;
quit;

For question 2, the answer is yes.

Germs · Posted 09-30-2018 10:11 PM

Thank you so much @RobPratt

I will continue developing the code for my project. I will update how it goes.

Germs · Posted 10-02-2018 09:55 PM

Hello RobPratt. I have tested your code, it worked greatl. Thank you so much.

For my question 2 using SAS macro, I tried using %let group=5, group 1..&group.; it worked.

For my question 1: I still have some details to work on, should I consider the objective function you wrote, means to minimize each y-value in group 1..5?

     min z = sum{j in groups} y[j];<code></code>

I tried both using and not using "decomp" option, it made a difference in speed.

For the last statement using "decomp", can I also have “iis=on”？Are they comparable?

I wish to use MILP/branch and bound/indicate IIS for my project (is that possible?)

   solve with milp /decomp iis=on;<code></code>

I tried to replace “decomp” with "IIS=on"(line 384), it gave me the following error:

384 solve with milp /decomp iis=on;

---
688
22
ERROR 688-782: The option IIS is unrecognized.

ERROR 22-322: Expecting one of the following: ABSOBJGAP, ALLCUTS, CONFLICTSEARCH, CUTCLIQUE,
CUTFLOWCOVER, CUTFLOWPATH, CUTGOMORY, CUTGUB, CUTIMPLIED, CUTKNAPSACK, CUTLAP,
CUTMILIFTED, CUTMIR, CUTOFF, CUTS, CUTSFACTOR, CUTSTRATEGY, CUTZEROHALF, DECOMP,
DECOMP_MASTER, DECOMP_MASTER_IP, DECOMP_SUBPROB, EMPHASIS, FEASTOL, HEURISTICS,
INTTOL, LOGFREQ, LOGLEVEL, MAXNODES, MAXSOLS, MAXTIME, NODESEL, NOPRIMALIN, OPTTOL,
PRESOLVER, PRIMALIN, PRINTFREQ, PRINTLEVEL2, PRIORITY, PROBE, RELOBJGAP, RESTARTS,
SCALE, SEED, STRONGITER, STRONGLEN, SYMMETRY, TARGET, TIMETYPE, VARSEL.

when I switched to LP, it relax all the constraints and did not pick any x[i] for me.

  solve with lp / iis=on;

WARNING: The problem contains integer variables. The relaxed LP will be solved. Specify the
RELAXINT keyword on the SOLVE statement to avoid this warning.
NOTE: The IIS= option is enabled.
Objective
Phase Iteration Value Time
P 1 1 1.774000E+03 0
P 1 138 1.121796E+02 0
P 1 233 8.137021E+00 0
P 1 328 3.910344E-01 0
P 1 358 0.000000E+00 0
P 1 359 0.000000E+00 0
NOTE: The IIS= option found this problem to be feasible.
NOTE: The IIS solve time is 0.02 seconds.

RobPratt · Posted 10-03-2018 05:07 PM

The IIS option for the LP solver (not the MILP solver) is to identify causes of infeasibility. Your problem is feasible, so this option is not applicable.

The y[j] variable measures the absolute error between psum[j] and the target value of 32. The objective z minimizes the sum of these errors. For your data, the optimal objective value turns out to be 0, which means that each group j meets the target exactly. You can also see this from the PRINT statement that yields:

[1]	psum
1	32
2	32
3	32
4	32
5	32

Germs · Posted 10-03-2018 10:36 PM

I have chosen one of your previous solutions (2 dimension's) as the best solution.Thank you so much!

My initial interest for SAS/OR is detecting IIS, remember you replied my first post here:

https://communities.sas.com/t5/Mathematical-Optimization/SAS-OR-in-LP-MIP-which-method-is-used-for-d...

I am a quantitative researcher but have limit knowledge in OR(this is the first time project), So I didn't really know IIS is for LP (not MILP). Thank you for telling this. I have several addition categorical constraints to add to the model, which I know will cause infeasibility. In addition to your code MILP version, I figured out for LP:

proc optmodel presolver=none;

....................
the code your developed;
....................
solve with lp relaxint;/ iis=on;

It seems to stop relaxing constraints;

There is another problem, how many nodes are normal? when I add constraint SD, for group 1..3, it worked fine; for group=4, it slowed down a little bit; when group=5, it ran with thousands of nodes and could not give a solution (maybe there wasn't one). I had to stop the program manually. Is there anyway to avoid this?

RobPratt · Posted 10-03-2018 10:43 PM

Please post the full code that is taking too long.

Germs · Posted 10-08-2018 11:21 PM

Rob, I noticed for this example (in 2-dimension), almost every constraint used "SUM", but if I want to define every single value in a linear program, e.g., pvalue[i]<0.2, how should I create this constraint? Thank you！

RobPratt · Posted 10-09-2018 11:26 AM

I don't quite understand your question. In your program, pvalue[i] is a numeric constant, not a variable.

Germs · Posted 10-09-2018 02:21 PM

To say if I want to create a quantitative constraint "choose items from the dataset with pvalue>=0.1" , how should I express this constraint in SAS/OR? This time the question should be clear. Thank you!

All the constraints in this example used "sum", tried a few times but failed.

RobPratt · Posted 10-09-2018 02:32 PM

OK, this sounds like the same situation that is addressed in this documentation example. The first approach uses an explicit constraint:

   /* do not assign customer to site more than 30 units away */
   con distance_at_most_30 {i in CUSTOMERS, j in SITES: dist[i,j] > 30}:
      Assign[i,j] = 0;

The second approach (more efficient) instead modifies the index set to avoid creating those variables at all.

Germs · Posted 10-09-2018 02:55 PM

I will try it later today to see if I can work it out. SAS/OR is difficult for beginner.

Thanks for the example. What is the second approach in details?

SAS/OR please help with a beginner's code

Re: SAS/OR please help with a beginner's code

Re: SAS/OR please help with a beginner's code

Re: SAS/OR please help with a beginner's code

Re: SAS/OR please help with a beginner's code

Re: SAS/OR please help with a beginner's code

Re: SAS/OR please help with a beginner's code

Re: SAS/OR please help with a beginner's code

Re: SAS/OR please help with a beginner's code

Re: SAS/OR please help with a beginner's code

Re: SAS/OR please help with a beginner's code

Re: SAS/OR please help with a beginner's code

Re: SAS/OR please help with a beginner's code

Re: SAS/OR please help with a beginner's code

Re: SAS/OR please help with a beginner's code

Re: SAS/OR please help with a beginner's code

Registration is open