Hello, I'm currently using SAS 9.4 on Windows. I'm pretty new to OPTMODEL, I've tried to solve this myself, but I just can't seem to figure out how to do it. Essentially, I'm building 5 test forms from a pool of 136 items. I'm using a matrix x[136,5] of binary values that are 1 if the item is selected and 0 if not. Each item has a content domain it belongs to (A B C), and a 'height' value of a curve when used in a psychometric model. The sum of the items' curves create the form's curve.
This is an extension of a previous post from months ago that has some visuals of what the optimization is for and a reference to the optimization method when generating a single form: https://communities.sas.com/t5/Mathematical-Optimization/Proc-OPTMODEL-using-objective-in-4-constrai...
I'm trying to generate 5 forms with the objective:
Minimize:
Subject to:
con p16_TCC_pos
and con p16_TCC_neg
that help minimize deviation between a target test curve height (13.374) and the 5 optimally generated forms' curve values.With just these constraints the code provided works, however, the last constraint I need is that any pair of forms can not have more than 7 items in common. I'm having problems getting this to work because I don't know how to apply the logic of the conditional compare while also using syntax appropriate for OPTMODEL into an impvar or constraint.
The logic I was thinking I could use to do this is along the lines of:
num count init 0; for {c in compares} do; for{j in 1..4, b in j+1..5} do; *for forms 12,13,14,15,23,24,25,34,35,45; count=count+1; if count>10 then leave; for{i in ITEMS} do; if x[i,j]*x[i,b]=1 then overlap[i,count]=1; *if x[i,j]=1 and x[i,b]=1 then overlap[i,count]=1; else overlap[i,count]=0; put j b i; end; end; end;
Running the full code will put j b and i in the log, and it is what I expect. It cycles through form pairs, and their 136 items. I'm just not sure how I would implement this, or something that accomplishes the same thing, in an impvar/constraint. If someone would be able to help me figure this out I would very much appreciate it.
I'm including 3 datasets from my 'help' library that are used in my program below.
proc optmodel;
set <str> BINARIES; *essentially creates variable name list;
read data help.binarytest into BINARIES=[name]; *read dataset into a data matrix, with BINARIES variable names;
set <str> NUMERICS;
read data help.numerictest into NUMERICS=[name];
set <num> ITEMS;
num rnum {ITEMS}; *list of rnum values, {items} in length;
num binary {BINARIES, ITEMS} ; *Creates matrix size [binaries x items] ---tried labeling as var and binary, but when printing at end, does not populate with 1&0 only 0's;
num numeric {NUMERICS, ITEMS};
*probably a better way to do this, but this seems to work;
*read data from pool dataset and put into ITEMS numbered 1->N, put rnum, and binary and numeric data into data matrix;
read data help.test_ref_pool_cln into ITEMS=[_N_] rnum /*was uin instead of rnum*/
{i in BINARIES} <binary[i,_N_]=col(i)>
{i in NUMERICS} <numeric[i,_N_]=col(i)>;
set forms = 1..5;
set compares = 1..10;
*****************************************************************************************************************;
*model specification;
var x{ITEMS,forms} BINARY, y >=0, overlap{ITEMS,compares} BINARY; *x is selection 1/0, y is deviation;
min object=y;
*only impvar total_p16 is in actual use now;
*first theta point, P P^2 P^3;
impvar total_p16 {j in forms} = sum{i in ITEMS}numeric["p1_6",i]*x[i,j];
*Trying to get [items x comparisons] to then sum across items to get a row of 10 columns that contain how many items are in common
in each pair of forms.;
num count init 0;
for {c in compares} do;
for{j in 1..4, b in j+1..5} do;*for 12,13,14,15,23,24,25,34,35,45 items 1 to 136 do;
count=count+1;
if count>10 then leave;
for{i in ITEMS} do;
if x[i,j]*x[i,b]=1 then overlap[i,count]=1;
*if x[i,j]=1 and x[i,b]=1 then overlap[i,count]=1;
else overlap[i,count]=0;
put j b i;
end;
end;
end;
;
*Same item can't appear on more than 3 forms;
con max_three {i in items}: sum{j in forms} x[i,j] <=3;
*constraints for first theta point P();
con p16_TCC_pos {j in forms}: total_p16[j] <= 13.374+y; /*@-.5: P pos deviation*/
con p16_TCC_neg {j in forms}: total_p16[j] >= 13.374-y; /*@-.5: P neg deviation*/
*content constraints;
con Acon {j in forms}: sum{i in ITEMS}binary["con_A",i]*x[i,j] =9; *9 from content A;
con Bcon {j in forms}: sum{i in ITEMS}binary["con_B",i]*x[i,j] =9; *9 from content B;
con Ccon {j in forms}: sum{i in ITEMS}binary["con_C",i]*x[i,j] =12; *12 from content C;
*total number of items per form;
con max_items {j in forms}: sum{i in ITEMS}x[i,j]=30; /*select 30 items from the pool*/
*I want this to be the constraint that makes sure each column of the sums of overlap among pairs is less than 7;
*con overlap {c in compares}: overlap <=7;
solve obj object with milp;
create data testset3 from [id]={i in items}{j in forms} <col("form"||j)=x[i,j]>;
quit;
In the dataset test_ref_pool_cln's variables p2_6, p1_9, and p2_9 are other target curve values. These correspond to additional constraints I have omitted here because I'm having memory usage problems again, probably because I'm doing everything inefficiently, so they aren't used as part of the objective for now.
You want to impose the following quadratic constraint:
set FORM_PAIRS = {j in FORMS, b in FORMS: j < b};
con overlap {<j,b> in FORM_PAIRS}:
sum {i in ITEMS} x[i,j] * x[i,b] <= 7;
But the MILP solver would yield an error message:
ERROR: The specified optimization technique does not allow nonlinear constraints.
Instead, you can linearize as follows and call the MILP solver:
set FORM_PAIRS = {j in FORMS, b in FORMS: j < b};
var z {ITEMS, FORM_PAIRS} binary;
/* if x[i,j] = 1 and x[i,b] = 1 then z[i,j,b] = 1 */
con linearization {i in ITEMS, <j,b> in FORM_PAIRS}:
x[i,j] + x[i,b] - 1 <= z[i,j,b];
con overlap {<j,b> in FORM_PAIRS}:
sum {i in ITEMS} z[i,j,b] <= 7;
By the way, we are working on automating various linearization techniques, including this one, to be released soon. Stay tuned!
What memory problems are you experiencing? Do you get any errors or warnings?
Have you tried running the program on SAS OnDemand for Academics?
I just ran your code in SAS OnDemand for Academics:
NOTE: The data set WORK.TESTSET3 has 136 observations and 6 variables. 139 quit; NOTE: PROCEDURE OPTMODEL used (Total process time): real time 2.76 seconds user cpu time 3.59 seconds system cpu time 1.05 seconds memory 101982.00k OS Memory 129332.00k Timestamp 27/11/2020
The results are attached as pdf.
Best regards, Jos
@JosvanderVelden I'm sorry, I should have been clearer in my post. I've edited it now to try to do that. The memory issue occurs when I use 4 references to the target curve height. I omitted the other 3 so that the program would run smoother when asking for help with the overlapping items issue. For completeness to this post and your question, the additional lines of code to have all constraints needed (and cause the memory problem) are :
impvar total_p26 {j in forms} = sum{i in ITEMS}numeric["p2_6",i]*x[i,j];
impvar total_p19 {j in forms} = sum{i in ITEMS}numeric["p1_9",i]*x[i,j];
impvar total_p29 {j in forms} = sum{i in ITEMS}numeric["p2_9",i]*x[i,j];
con p26_TCC_pos {j in forms}: total_p26[j] <= 7.732+y; /*@-.5: P^2 pos deviation*/
con p26_TCC_neg {j in forms}: total_p26[j] >= 7.732-y; /*@-.5: P^2 neg deviation*/
con p19_TCC_pos {j in forms}: total_p19[j] <= 22.151+y; /*@1: P pos deviation*/
con p19_TCC_neg {j in forms}: total_p19[j] >= 22.151-y; /*@1: P neg deviation*/
con p29_TCC_pos {j in forms}: total_p29[j] <= 17.322+y; /*@1: P^2 pos deviation*/
con p29_TCC_neg {j in forms}: total_p29[j] >= 17.322-y; /*@1: P^2 neg deviation*/
I know this isn't he most efficient way to code this, but I'm still learning so I'm trying to make sure I grasp each step when I make the code more complex (I guess I'll see how well that works out).
My old post I linked to up above dealt with a similar issue. The 'solution' then was to just not use as many decimal places, I rounded the values to post on the forum for help (and didn't run it with the new values) and it worked fine, so I felt pretty silly.
Thanks so much for your time and for the information on SAS OnDemand for Academics, I'll definitely be checking that out.
You want to impose the following quadratic constraint:
set FORM_PAIRS = {j in FORMS, b in FORMS: j < b};
con overlap {<j,b> in FORM_PAIRS}:
sum {i in ITEMS} x[i,j] * x[i,b] <= 7;
But the MILP solver would yield an error message:
ERROR: The specified optimization technique does not allow nonlinear constraints.
Instead, you can linearize as follows and call the MILP solver:
set FORM_PAIRS = {j in FORMS, b in FORMS: j < b};
var z {ITEMS, FORM_PAIRS} binary;
/* if x[i,j] = 1 and x[i,b] = 1 then z[i,j,b] = 1 */
con linearization {i in ITEMS, <j,b> in FORM_PAIRS}:
x[i,j] + x[i,b] - 1 <= z[i,j,b];
con overlap {<j,b> in FORM_PAIRS}:
sum {i in ITEMS} z[i,j,b] <= 7;
By the way, we are working on automating various linearization techniques, including this one, to be released soon. Stay tuned!
Thank you so much!
That is a much more elegant solution than I was messing with. I think that what you said was the exact track I was on too. I tried the first solution and got the error about it being non-linear, so I tried to move to handling it in a different way.
I'm excited to see the new linearization feature. I started reading about optimization and thought it was so interesting that it could be applied to my field. Since then I've been trying to learn about OPTMODEL in SAS and applying it to work typical in my field. I've been viewing a lot of apples in bag problems as items in tests problems.
The linearization feature I mentioned is now available. For your example, here is how to use it:
set FORM_PAIRS = {j in FORMS, b in FORMS: j < b};
con overlap {<j,b> in FORM_PAIRS}: sum {i in ITEMS} x[i,j] * x[i,b] <= 7;
solve linearize;
OPTMODEL introduces the new binary variables and constraints for you under the hood.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.
Find more tutorials on the SAS Users YouTube channel.