Turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Options

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

☑ This topic is **solved**.
Need further help from the community? Please
sign in and ask a **new** question.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Posted 05-24-2022 10:21 AM
(1193 views)

Hello,

I have a dataset like the following:

data have;

id=1; x1=10; x2=3; x3=5; y=220;

y1g=12; y2g=6; y3g=10;

output;

id=2; x1=7; x2=0; x3=3; y=100;

y1g=12; y2g=0; y3g=10;

output;

run;

for each record I want to find values for y1,y2,y3

such that

x1*y1+x2*y2+x3*y3=y

0=<y1<=12

0=<y2<=6

0=<y3<=10

I am not sure of which can be the objective function, but to fix the ideas

I might want that the solutions are as similar as possible to some initial guess, y1g,y2g,y3g.

so for instance the objective might be:

min (y1-y1g)**2+(y2-y2g)**2+(y3-y3g)**2

I am using SAS9 M5 and have to apply the program to a dataset of about 100000 records.

I am very new to sas/or but wishing to learn it at least a little.

thank you very much in advance for any help or suggestions

1 ACCEPTED SOLUTION

Accepted Solutions

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

This might help get you started with the syntax.

The first row is infeasible due to the constraint. If all three decision variables are at their upper bounds, 10*y1 + 3*y2 + 5*y3 equals 188, less than the y value of 220. The solution status column in the output data set captures this information for all rows.

```
proc optmodel;
/* declare an index set */
set <num> ROWS;
/* declare parameters for each row in the set */
num x1{ROWS};
num x2{ROWS};
num x3{ROWS};
num y{ROWS};
num y1g{ROWS};
num y2g{ROWS};
num y3g{ROWS};
/* read the have data set and parameters into OPTMODEL to populate */
read data have into ROWS=[id] x1 x2 x3 y y1g y2g y3g;
/* iteration scalar to loop through and solve each row */
num i;
/* decision variables for each row */
var y1{ROWS} >= 0 <= 12;
var y2{ROWS} >= 0 <= 6;
var y3{ROWS} >= 0 <= 10;
/* declare and later populate dec variables with optimal values */
num y1sol{ROWS};
num y2sol{ROWS};
num y3sol{ROWS};
/* declare and later populate solution status of the run */
str solstatus {ROWS};
con Equality: x1[i]*y1[i] + x2[i]*y2[i] + x3[i]*y3[i] = y[i];
min Obj = (y1[i]-y1g[i])**2+(y2[i]-y2g[i])**2+(y3[i]-y3g[i])**2;
/* loop to solve for each row */
do i = 1 to card(ROWS);
solve;
*print y1[i] y2[i] y3[i];
y1sol[i] = y1[i].sol;
y2sol[i] = y2[i].sol;
y3sol[i] = y3[i].sol;
solstatus[i] = _solution_status_;
end;
/* create output data set */
create data want from [id]=ROWS
x1 x2 x3 y y1g y2g y3g y1sol y2sol y3sol solstatus;
quit;
```

8 REPLIES 8

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Thank you very much Rick.

I tried to learn by my self, but a bit in a hurry.

I would appreciate if you could suggest me the code.

thank you in advance

I tried to learn by my self, but a bit in a hurry.

I would appreciate if you could suggest me the code.

thank you in advance

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

This might help get you started with the syntax.

The first row is infeasible due to the constraint. If all three decision variables are at their upper bounds, 10*y1 + 3*y2 + 5*y3 equals 188, less than the y value of 220. The solution status column in the output data set captures this information for all rows.

```
proc optmodel;
/* declare an index set */
set <num> ROWS;
/* declare parameters for each row in the set */
num x1{ROWS};
num x2{ROWS};
num x3{ROWS};
num y{ROWS};
num y1g{ROWS};
num y2g{ROWS};
num y3g{ROWS};
/* read the have data set and parameters into OPTMODEL to populate */
read data have into ROWS=[id] x1 x2 x3 y y1g y2g y3g;
/* iteration scalar to loop through and solve each row */
num i;
/* decision variables for each row */
var y1{ROWS} >= 0 <= 12;
var y2{ROWS} >= 0 <= 6;
var y3{ROWS} >= 0 <= 10;
/* declare and later populate dec variables with optimal values */
num y1sol{ROWS};
num y2sol{ROWS};
num y3sol{ROWS};
/* declare and later populate solution status of the run */
str solstatus {ROWS};
con Equality: x1[i]*y1[i] + x2[i]*y2[i] + x3[i]*y3[i] = y[i];
min Obj = (y1[i]-y1g[i])**2+(y2[i]-y2g[i])**2+(y3[i]-y3g[i])**2;
/* loop to solve for each row */
do i = 1 to card(ROWS);
solve;
*print y1[i] y2[i] y3[i];
y1sol[i] = y1[i].sol;
y2sol[i] = y2[i].sol;
y3sol[i] = y3[i].sol;
solstatus[i] = _solution_status_;
end;
/* create output data set */
create data want from [id]=ROWS
x1 x2 x3 y y1g y2g y3g y1sol y2sol y3sol solstatus;
quit;
```

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Thank you very much ChanceTGardener.

I appreciated it a lot and especially the comments that helps me to understand the code.

Ciro

I appreciated it a lot and especially the comments that helps me to understand the code.

Ciro

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

I have a problem when applying the procedure to my dataset of about 100k observations.

when I tried with a subset of 10k observations it took about 6 minutes.

so I imagined that 1-2 hours was a sufficient time to process the whole dataset.

Instead after a night the procedure was not finished.

I closed all output destination and routed the log with proc printto.

I also tried to suppress the the notes to log, but with no success.

Any idea of how can I complete the task? Should I partition the problem, dividing the dataset, or is there a smarter solution?

Thank you very much in advance

when I tried with a subset of 10k observations it took about 6 minutes.

so I imagined that 1-2 hours was a sufficient time to process the whole dataset.

Instead after a night the procedure was not finished.

I closed all output destination and routed the log with proc printto.

I also tried to suppress the the notes to log, but with no success.

Any idea of how can I complete the task? Should I partition the problem, dividing the dataset, or is there a smarter solution?

Thank you very much in advance

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

The original code works correctly but has many unused variables (three times the number of rows when you need only three). You will get much better performance by replacing the three VAR statements as shown below. When calling the solver in a loop, I often use PRINTLEVEL=0 and OPTION NONOTES and also a PUT statement to show the progress, as shown.

```
proc optmodel printlevel=0;
/* declare an index set */
set <num> ROWS;
/* declare parameters for each row in the set */
num x1{ROWS};
num x2{ROWS};
num x3{ROWS};
num y{ROWS};
num y1g{ROWS};
num y2g{ROWS};
num y3g{ROWS};
/* read the have data set and parameters into OPTMODEL to populate */
read data have into ROWS=[id] x1 x2 x3 y y1g y2g y3g;
/* iteration scalar to loop through and solve each row */
num i;
/* decision variables for each row */
/* var y1{ROWS} >= 0 <= 12;*/
/* var y2{ROWS} >= 0 <= 6;*/
/* var y3{ROWS} >= 0 <= 10;*/
var y1 >= 0 <= 12;
var y2 >= 0 <= 6;
var y3 >= 0 <= 10;
/* declare and later populate dec variables with optimal values */
num y1sol{ROWS};
num y2sol{ROWS};
num y3sol{ROWS};
/* declare and later populate solution status of the run */
str solstatus {ROWS};
/* con Equality: x1[i]*y1[i] + x2[i]*y2[i] + x3[i]*y3[i] = y[i];*/
con Equality: x1[i]*y1 + x2[i]*y2 + x3[i]*y3 = y[i];
/* min Obj = (y1[i]-y1g[i])**2+(y2[i]-y2g[i])**2+(y3[i]-y3g[i])**2;*/
min Obj = (y1-y1g[i])**2+(y2-y2g[i])**2+(y3-y3g[i])**2;
/* loop to solve for each row */
option nonotes;
do i = 1 to card(ROWS);
if mod(i,1000) = 0 then put i=;
solve;
*print y1[i] y2[i] y3[i];
/* y1sol[i] = y1[i].sol;*/
/* y2sol[i] = y2[i].sol;*/
/* y3sol[i] = y3[i].sol;*/
y1sol[i] = y1.sol;
y2sol[i] = y2.sol;
y3sol[i] = y3.sol;
solstatus[i] = _solution_status_;
end;
option notes;
/* create output data set */
create data want from [id]=ROWS
x1 x2 x3 y y1g y2g y3g y1sol y2sol y3sol solstatus;
quit;
```

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

And in SAS Viya, you can use the runOptmodel action with BY-group processing to solve the independent instances concurrently, without having to write any explicit loops:

```
data sascas1.have;
set have;
run;
proc cas;
action setsessopt / messageLevel="warning";
run;
source pgm;
/* declare parameters for each row in the set */
num x1;
num x2;
num x3;
num y;
num y1g;
num y2g;
num y3g;
/* read the have data set and parameters into OPTMODEL to populate */
read data have into x1 x2 x3 y y1g y2g y3g;
/* decision variables for each row */
var y1 >= 0 <= 12;
var y2 >= 0 <= 6;
var y3 >= 0 <= 10;
con Equality: x1*y1 + x2*y2 + x3*y3 = y;
min Obj = (y1-y1g)**2+(y2-y2g)**2+(y3-y3g)**2;
/* solve one row */
solve;
/* create output data set */
create data want from
x1 x2 x3 y y1g y2g y3g y1sol=y1 y2sol=y2 y3sol=y3 solstatus=_solution_status_;
endsource;
action optimization.runOptmodel / code=pgm groupBy={"id"} printlevel=0;
run;
quit;
```

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Thank you very much Rob. the code on the entire dataset run in only about 10 minutes!

The biggest improvement is probably the reduction of the number of variables, However a role is likely to be played also by printlevel=0.

I realized that without it Sas kept on printing on the html output making it grow continuously and congesting the system.

The strange thing is that I had on my code: ods html close; and ods html exclude all. So there is something that I misunderstand about these statements.

However with printlevel=0, sas stopped to print the output.

**Don't miss out on SAS Innovate - Register now for the FREE Livestream!**

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

Multiple Linear Regression in SAS

Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.

Find more tutorials on the SAS Users YouTube channel.