BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
dblamcvy
Fluorite | Level 6

Hello, I'm currently using SAS 9.4 on Windows. I'm pretty new to OPTMODEL, I've tried to solve this myself, but I just can't seem to figure out how to do it. Essentially, I'm building 5 test forms from a pool of 136 items. I'm using a matrix x[136,5] of binary values that are 1 if the item is selected and 0 if not. Each item has a content domain it belongs to (A B C), and a 'height' value of a curve when used in a psychometric model. The sum of the items' curves create the form's curve. 

 

This is an extension of a previous post from months ago that has some visuals of what the optimization is for and a reference to the optimization method when generating a single form: https://communities.sas.com/t5/Mathematical-Optimization/Proc-OPTMODEL-using-objective-in-4-constrai...

 

I'm trying to generate 5 forms with the objective:

Minimize:

  • y

Subject to:

  • An item can appear a maximum of 3 times across the 5 forms
  • 9 items from content domain A are selected, 9 from B, and 12 from C
  • Maximum of 30 items per form (redundant with current setup)
  • 2 constraints con p16_TCC_pos and con p16_TCC_neg that help minimize deviation between a target test curve height (13.374) and the 5 optimally generated forms' curve values.

With just these constraints the code provided works, however, the last constraint I need is that any pair of forms can not have more than 7 items in common. I'm having problems getting this to work because I don't know how to apply the logic of the conditional compare while also using syntax appropriate for OPTMODEL into an impvar or constraint.

 

The logic I was thinking I could use to do this is along the lines of:

 

num count init 0;
for {c in compares} do;	
	for{j in 1..4, b in j+1..5} do;  *for forms 12,13,14,15,23,24,25,34,35,45;
		count=count+1;
		if count>10 then leave;
		for{i in ITEMS} do;
		if x[i,j]*x[i,b]=1 then overlap[i,count]=1; 
		*if x[i,j]=1 and x[i,b]=1 then overlap[i,count]=1;
		else  overlap[i,count]=0;
		put j b i;
		end;
	end;
end;

Running the full code will put j b and i in the log, and it is what I expect. It cycles through form pairs, and their 136 items. I'm just not sure how I would implement this, or something that accomplishes the same thing, in an impvar/constraint. If someone would be able to help me figure this out I would very much appreciate it.

 

 

I'm including 3 datasets from my 'help' library that are used in my program below.

proc optmodel;
	set <str> BINARIES; *essentially creates variable name list;
	read data help.binarytest into BINARIES=[name]; *read dataset into a data matrix, with BINARIES variable names;

	set <str> NUMERICS;
	read data help.numerictest into NUMERICS=[name];

	set <num> ITEMS;
	num rnum {ITEMS}; *list of rnum values, {items} in length;
	num binary {BINARIES, ITEMS} ; *Creates matrix size [binaries x items] ---tried labeling as var and binary, but when printing at end, does not populate with 1&0 only 0's;
	num numeric {NUMERICS, ITEMS};

	*probably a better way to do this, but this seems to work;
	*read data from pool dataset and put into ITEMS numbered 1->N, put rnum, and binary and numeric data into data matrix;
	read data help.test_ref_pool_cln into ITEMS=[_N_] rnum /*was uin instead of rnum*/ 
		{i in BINARIES} <binary[i,_N_]=col(i)>
		{i in NUMERICS} <numeric[i,_N_]=col(i)>;

	set forms = 1..5;
	set compares = 1..10;
*****************************************************************************************************************;
*model specification;
	var x{ITEMS,forms} BINARY,  y >=0, overlap{ITEMS,compares} BINARY; *x is selection 1/0,  y is deviation;
	min object=y;

*only impvar total_p16 is in actual use now;
*first theta point, P P^2 P^3;
	impvar total_p16 {j in forms} = sum{i in ITEMS}numeric["p1_6",i]*x[i,j];

*Trying to get [items x comparisons] to then sum across items to get a row of 10 columns that contain how many items are in common
			in each pair of forms.;

num count init 0;
for {c in compares} do;	
	for{j in 1..4, b in j+1..5} do;*for 12,13,14,15,23,24,25,34,35,45 items 1 to 136 do;
		count=count+1;
		if count>10 then leave;
		for{i in ITEMS} do;
		if x[i,j]*x[i,b]=1 then overlap[i,count]=1; 
		*if x[i,j]=1 and x[i,b]=1 then overlap[i,count]=1;
		else  overlap[i,count]=0;
		put j b i;
		end;
	end;
end;
	;

*Same item can't appear on more than 3 forms;
	con max_three {i in items}: sum{j in forms} x[i,j] <=3;

*constraints for first theta point P();
	con p16_TCC_pos {j in forms}: total_p16[j] <= 13.374+y; /*@-.5: P pos deviation*/
	con p16_TCC_neg {j in forms}: total_p16[j] >= 13.374-y; /*@-.5: P neg deviation*/

*content constraints;
	con Acon {j in forms}:  	 	sum{i in ITEMS}binary["con_A",i]*x[i,j]	=9; 	*9 from content A;
	con Bcon {j in forms}:  		sum{i in ITEMS}binary["con_B",i]*x[i,j]	=9; 	*9 from content B;
	con Ccon {j in forms}:  		sum{i in ITEMS}binary["con_C",i]*x[i,j]	=12;	*12 from content C;

*total number of items per form;
	con max_items {j in forms}: sum{i in ITEMS}x[i,j]=30; /*select 30 items from the pool*/

*I want this to be the constraint that makes sure each column of the sums of overlap among pairs is less than 7;
	*con overlap {c in compares}: overlap <=7;

	solve obj object with milp;
	create data testset3 from [id]={i in items}{j in forms} <col("form"||j)=x[i,j]>;
quit;

In the dataset test_ref_pool_cln's variables p2_6, p1_9, and p2_9 are other target curve values. These correspond to additional constraints I have omitted here because I'm having memory usage problems again, probably because I'm doing everything inefficiently, so they aren't used as part of the objective for now.

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
RobPratt
SAS Super FREQ

You want to impose the following quadratic constraint:

set FORM_PAIRS = {j in FORMS, b in FORMS: j < b};
con overlap {<j,b> in FORM_PAIRS}:
   sum {i in ITEMS} x[i,j] * x[i,b] <= 7;

But the MILP solver would yield an error message:

ERROR: The specified optimization technique does not allow nonlinear constraints.

Instead, you can linearize as follows and call the MILP solver:

set FORM_PAIRS = {j in FORMS, b in FORMS: j < b};
var z {ITEMS, FORM_PAIRS} binary;
/* if x[i,j] = 1 and x[i,b] = 1 then z[i,j,b] = 1 */
con linearization {i in ITEMS, <j,b> in FORM_PAIRS}:
   x[i,j] + x[i,b] - 1 <= z[i,j,b];
con overlap {<j,b> in FORM_PAIRS}:
   sum {i in ITEMS} z[i,j,b] <= 7;

By the way, we are working on automating various linearization techniques, including this one, to be released soon.  Stay tuned!

 

View solution in original post

5 REPLIES 5
JosvanderVelden
SAS Super FREQ

What memory problems are you experiencing? Do you get any errors or warnings?
Have you tried running the program on SAS OnDemand for Academics?

I just ran your code in SAS OnDemand for Academics:

 NOTE: The data set WORK.TESTSET3 has 136 observations and 6 variables.
 139        quit;
 NOTE: PROCEDURE OPTMODEL used (Total process time):
       real time           2.76 seconds
       user cpu time       3.59 seconds
       system cpu time     1.05 seconds
       memory              101982.00k
       OS Memory           129332.00k
       Timestamp           27/11/2020

The results are attached as pdf.

Best regards, Jos

dblamcvy
Fluorite | Level 6

@JosvanderVelden I'm sorry, I should have been clearer in my post. I've edited it now to try to do that. The memory issue occurs when I use 4 references to the target curve height. I omitted the other 3 so that the program would run smoother when asking for help with the overlapping items issue. For completeness to this post and your question, the additional lines of code to have all constraints needed (and cause the memory problem) are :

	impvar total_p26 {j in forms} = sum{i in ITEMS}numeric["p2_6",i]*x[i,j];
	impvar total_p19 {j in forms} = sum{i in ITEMS}numeric["p1_9",i]*x[i,j];
	impvar total_p29 {j in forms} = sum{i in ITEMS}numeric["p2_9",i]*x[i,j];

	con p26_TCC_pos {j in forms}: total_p26[j] <= 7.732+y; /*@-.5: P^2 pos deviation*/
	con p26_TCC_neg {j in forms}: total_p26[j] >= 7.732-y; /*@-.5: P^2 neg deviation*/
	con p19_TCC_pos {j in forms}: total_p19[j] <= 22.151+y; /*@1: P pos deviation*/
	con p19_TCC_neg {j in forms}: total_p19[j] >= 22.151-y; /*@1: P neg deviation*/
	con p29_TCC_pos {j in forms}: total_p29[j] <= 17.322+y; /*@1: P^2 pos deviation*/
	con p29_TCC_neg {j in forms}: total_p29[j] >= 17.322-y; /*@1: P^2 neg deviation*/

I know this isn't he most efficient way to code this, but I'm still learning so I'm trying to make sure I grasp each step when I make the code more complex (I guess I'll see how well that works out).

 

My old post I linked to up above dealt with a similar issue. The 'solution' then was to just not use as many decimal places, I rounded the values to post on the forum for help (and didn't run it with the new values) and it worked fine, so I felt pretty silly.

 

Thanks so much for your time and for the information on SAS OnDemand for Academics, I'll definitely be checking that out.

RobPratt
SAS Super FREQ

You want to impose the following quadratic constraint:

set FORM_PAIRS = {j in FORMS, b in FORMS: j < b};
con overlap {<j,b> in FORM_PAIRS}:
   sum {i in ITEMS} x[i,j] * x[i,b] <= 7;

But the MILP solver would yield an error message:

ERROR: The specified optimization technique does not allow nonlinear constraints.

Instead, you can linearize as follows and call the MILP solver:

set FORM_PAIRS = {j in FORMS, b in FORMS: j < b};
var z {ITEMS, FORM_PAIRS} binary;
/* if x[i,j] = 1 and x[i,b] = 1 then z[i,j,b] = 1 */
con linearization {i in ITEMS, <j,b> in FORM_PAIRS}:
   x[i,j] + x[i,b] - 1 <= z[i,j,b];
con overlap {<j,b> in FORM_PAIRS}:
   sum {i in ITEMS} z[i,j,b] <= 7;

By the way, we are working on automating various linearization techniques, including this one, to be released soon.  Stay tuned!

 

dblamcvy
Fluorite | Level 6

Thank you so much!

That is a much more elegant solution than I was messing with. I think that what you said was the exact track I was on too. I tried the first solution and got the error about it being non-linear, so I tried to move to handling it in a different way.

 

I'm excited to see the new linearization feature. I started reading about optimization and thought it was so interesting that it could be applied to my field. Since then I've been trying to learn about OPTMODEL in SAS and applying it to work typical in my field. I've been viewing a lot of apples in bag problems as items in tests problems.

 

RobPratt
SAS Super FREQ

The linearization feature I mentioned is now available.  For your example, here is how to use it:

   set FORM_PAIRS = {j in FORMS, b in FORMS: j < b};
   con overlap {<j,b> in FORM_PAIRS}: sum {i in ITEMS} x[i,j] * x[i,b] <= 7;
   solve linearize;

OPTMODEL introduces the new binary variables and constraints for you under the hood.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

Multiple Linear Regression in SAS

Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 5 replies
  • 1040 views
  • 2 likes
  • 3 in conversation