Re: Optimizing a macro output

vioravis · Posted 05-04-2012 07:37 PM

I have both SAS/OR and SAS/IML and would like to perform nonlinear optimization on the values returned by a macro. This macro takes A and B as input and returns a result specific to those A and B values. There are also lower and upper bounds on both A and B. The following pseudocode explains this:

MACRO returnResult(A, B)
{
    ...................................
    result = function (A, B)
}

The objective function to be minimized is the "result" returned by the macro, subject to upper and lower bounds on A and B. Can someone please help me set this up as a nonlinear optimization problem using SAS/OR or SAS/IML?

Thank you.

Ravi

vioravis · Posted 05-06-2012 03:05 PM

Specifically, if I have a macro that contains the following data steps that calculate the objective function, is it possible to use them within either PROC IML or OPTMODEL? (I am new to both.) Any help would be appreciated. Thank you.

/*create some test data*/
DATA test (drop=i);
    input X Y Z;
    do i=1 to 2000;
        output;
    end;
    cards;
1 10 0
2 20 0
3 30 0
4 40 0
5 50 0
;

data _null_;
    if 0 then set test nobs=nobs;
    CALL SYMPUT('NUMREC',nobs); /*** put # of records into NUMREC macro var ***/
    stop; /*** stop, got number of records ***/
run;

/* now read values of x and y into arrays, and then
   reread test and do calculations*/
data want (drop=_:);
    array _xval(&numrec); /* create two arrays with same number of
                             elements as there are records in test */
    array _yval(&numrec);
    i=0;
    do until (eof1); /* load the array with x and y values */
        set test end=eof1;
        i+1;
        _xval(i)=x;
        _yval(i)=y;
    end;
    _rec=-1;
    do until (eof2); /* read in each record separately */
        set test end=eof2;
        _rec+1;
        z=0;
        do _i=1 to _rec;
            z+_xval(_i)*_yval(&numrec-_rec+_i);
        end;
        output;
    end;
run;

Ravi

RobPratt · Posted 05-07-2012 05:26 PM

You can do something like the following in OPTMODEL:

proc optmodel;
    set OBS;
    num x {OBS};
    num y {OBS};
    num n = card(OBS);
    read data test into OBS=[_N_] x y;
    /* y index n-j+1+i matches &numrec-_rec+_i in the DATA step, where _rec = j-1 */
    num z {j in OBS} = sum {i in 1..j-1} x[i] * y[n-j+1+i];
    create data want(drop=i) from [i] x y z;
quit;

But that just calculates z from the given values of x and y. Are x and y intended to be the decision variables in your optimization problem? Do you have any constraints on x and y? And do you want to minimize or maximize z?
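
To check the index arithmetic outside SAS, the calculation can be sketched in Python. This is only an illustration, not SAS code: the function name `lagged_products` is made up here, and the lists are 0-based where the SAS arrays are 1-based. It mirrors the DATA step posted earlier in the thread, where record j sums x(i) * y(n-(j-1)+i) for i = 1 to j-1:

```python
def lagged_products(x, y):
    """Return z, where z for record j is sum over i=1..j-1 of x(i) * y(n-(j-1)+i)."""
    n = len(x)
    z = []
    for j in range(1, n + 1):            # record number, 1-based as in SAS
        total = 0.0
        for i in range(1, j):            # i = 1 .. j-1 (empty for the first record)
            # SAS index n-(j-1)+i, shifted by one for a 0-based Python list
            total += x[i - 1] * y[n - j + i]
        z.append(total)
    return z


print(lagged_products([1, 2, 3], [10, 20, 30]))  # [0.0, 30.0, 80.0]
```

The first record always gets z = 0, and the last record pairs x(1..n-1) with y(2..n).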

vioravis · Posted 05-07-2012 05:49 PM

Hi Rob,

I am trying to optimize a two-variable (x and y) macro function. The macro takes x and y as input and returns the corresponding sum of squared errors, which needs to be minimized. There are upper and lower bounds on both x and y. The macro is fairly complex, with several data steps (I can include the entire macro within the optimization procedure, as I have done below). I would like to use a quasi-Newton approach to determine the values of x and y that minimize the objective function. I have both SAS/OR and SAS/IML but am having difficulty finding the right approach for this problem. I tried SAS/IML as follows, using a dummy objective function calculation:

proc iml;
start F_ROSEN(x);
    submit;
        data test;
            input x;
            datalines;
1.2
0.3
10
7
2.9
;
        run;

        PROC SQL NOPRINT;
            SELECT SUM(x) into :SUMM
            FROM test;
        QUIT;
    endsubmit;

    newSum = &SUMM;
    y1 = 10 * (x[2] - x[1] * x[1]);
    y2 = 1 - x[1];
    f = 0.5 * (y1 * y1 + y2 * y2)+newSum;
    return(f);
finish F_ROSEN;

x = {-1.2 1 1};
optn = {0 2 . 2};
call nlpqn(rc,xr,"F_ROSEN",x,optn);
quit;

However, the problem with this approach is that, I believe, the function f needs to be defined in terms of the decision variables (x and y). In my case, it is just the sum of squared errors corresponding to a particular value of x and y, i.e., f = &SUMM;

Please let me know if this can be handled in SAS.

Thank you.

Ravi

RobPratt · Posted 05-07-2012 06:05 PM

Yes, your objective function should be expressed in terms of your decision variables, as in this "Getting Started" example:

http://support.sas.com/documentation/cdl/en/ormpug/63975/HTML/default/viewer.htm#ormpug_optmodel_sec...

But it is still not clear to me whether your x and y are decision variables or input data. Maybe it would help to describe your optimization problem mathematically, without code. What is your input, and what is your desired output?

vioravis · Posted 05-08-2012 03:08 AM

Hi Rob,

I have given a simple example of what I am trying to do. In the following macro, x1 and x2 are the decision variables and SSE is the objective function to be minimized. (The one I have given is fairly simple; the actual code has far more calculations to arrive at SSE.) SSE = f(x1, x2) is the objective function, but it is not possible to write out the calculation since the calculations are not straightforward.

data test;
    input hours actualRisk;
    datalines;
1.2 0.05
0.3 0.008
10 0.5
7 0.09
2.9 0.15
3 0.1
7.5 0.25
8 0.20
3.5 0.1
1.7 0.15
;
run;

%MACRO generateSSE(x1, x2);
    DATA test;
        SET test;
        calculatedRisk = CDF('WEIBULL',hours,&x1,&x2);
    RUN;

    data test;
        set test;
        squaredErrors = (actualRisk - calculatedRisk)*(actualRisk - calculatedRisk);
    run;

    PROC SQL NOPRINT;
        SELECT SUM(squaredErrors) into :SSE
        FROM test;
    QUIT;

    %PUT &SSE;
%MEND generateSSE;

%generateSSE(1,2);

Could you please let me know how to pass generateSSE to an optimizer to find the values of x1 and x2 that minimize SSE?

I have implemented the same using the optim function in R and have given the code below:

hours <- c(1.2, 0.3, 10, 7, 2.9, 3, 7.5, 8, 3.5, 1.7)
actualRisk <- c(0.05, 0.008, 0.5, 0.09, 0.15, 0.1, 0.25, 0.20, 0.1, 0.15)

test <- data.frame(hours = hours, actualRisk = actualRisk)

generateSSE <- function(x)
{
  # pweibull is the Weibull CDF, matching CDF('WEIBULL', ...) in the SAS macro
  # (dweibull would return the density instead)
  test$calculatedRisk <- pweibull(test$hours, shape = x[1], scale = x[2])
  test$squaredErrors <- (test$actualRisk - test$calculatedRisk)^2
  return(sum(test$squaredErrors))
}

# bounds: shape in [0.4, 8], scale in [0.05, 1000]
# (reltol is dropped because L-BFGS-B uses factr and pgtol instead)
optout <- optim(c(1, 2), generateSSE, NULL, method = "L-BFGS-B",
                lower = c(0.4, 0.05), upper = c(8, 1000))

I would like to implement a similar solution in SAS.

Thank you.

Ravi

RobPratt · Posted 05-08-2012 09:54 AM

Thanks for the clarification. If I understand your example correctly, the following code solves the problem in one PROC OPTMODEL call, without using any macros. Maybe your more complicated objective can be handled similarly.

data bounds;
    input lower upper;
    datalines;
0.4     8
0.05 1000
;

proc optmodel;
    set OBS;
    num hours {OBS};
    num actualRisk {OBS};
    read data test into OBS=[_N_] hours actualRisk;

    set VARS;
    var x {VARS};
    read data bounds into VARS=[_N_] x.lb=lower x.ub=upper;
    impvar calculatedRisk {i in OBS} = CDF('WEIBULL',hours[i],x[1],x[2]);
    min SSE = sum {i in OBS} (actualRisk[i] - calculatedRisk[i])^2;

    solve;
    print x;
    print actualRisk calculatedRisk;
    create data x_sol from [var] x;
quit;
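
For comparison, the same objective can be sketched in plain Python. This is an illustration only, not SAS: `weibull_cdf`, `sse`, and `compass_search` are hypothetical names, and the sketch assumes that CDF('WEIBULL', t, shape, scale) is 1 - exp(-(t/scale)^shape). The compass (pattern) search is a crude stand-in for a solver. It is not a quasi-Newton method and not what PROC OPTMODEL runs, so it may stop at a different point than the SAS solve:

```python
import math

hours = [1.2, 0.3, 10, 7, 2.9, 3, 7.5, 8, 3.5, 1.7]
actual_risk = [0.05, 0.008, 0.5, 0.09, 0.15, 0.1, 0.25, 0.20, 0.1, 0.15]


def weibull_cdf(t, shape, scale):
    # Assumed parameterization: 1 - exp(-(t/scale)^shape)
    return 1.0 - math.exp(-((t / scale) ** shape))


def sse(x):
    # x[0] = shape, x[1] = scale: the two decision variables
    shape, scale = x
    return sum((r - weibull_cdf(h, shape, scale)) ** 2
               for h, r in zip(hours, actual_risk))


def compass_search(f, start, lower, upper, step=1.0, tol=1e-6):
    """Derivative-free bounded search: try +/- step on each coordinate,
    keep any improvement, and halve the step when nothing improves."""
    best = list(start)
    best_val = f(best)
    while step > tol:
        improved = False
        for k in range(len(best)):
            for direction in (1.0, -1.0):
                trial = list(best)
                # Clamp the trial point to the bounds
                trial[k] = min(upper[k], max(lower[k], trial[k] + direction * step))
                val = f(trial)
                if val < best_val:
                    best, best_val, improved = trial, val, True
        if not improved:
            step /= 2.0
    return best, best_val


x_opt, sse_opt = compass_search(sse, [1.0, 2.0],
                                lower=[0.4, 0.05], upper=[8.0, 1000.0])
print(x_opt, sse_opt)
```

The point of the sketch is the shape of the problem: the objective is an ordinary function of the two decision variables, and the bounds are applied to those variables rather than to the data.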

vioravis · Posted 05-08-2012 10:21 AM

Thanks a lot Rob for your solution. I will take a look at it and see whether we could rewrite the entire objective function the same way.

A quick question: Is it possible to call PROC IML from within PROC OPTMODEL, since the objective function calculation involves matrix multiplications?

Thank you.

Ravi

RobPratt · Posted 05-08-2012 11:22 AM

You cannot currently call PROC IML from within PROC OPTMODEL. In release 12.1 due out later this year, you can use a SUBMIT block to call generic SAS code from within PROC OPTMODEL. But even that functionality will not enable you to implement a black-box objective function. Instead, you can indicate matrix multiplication by using the SUM operator:

c[i,j] = sum {k in 1..n} a[i,k] * b[k,j]
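
The same sum-over-k pattern, written as a minimal Python sketch (not SAS; `mat_mul` is a made-up helper name, and matrices are assumed to be lists of equal-length rows):

```python
def mat_mul(a, b):
    """c[i][j] = sum over k of a[i][k] * b[k][j]."""
    inner = len(b)
    if any(len(row) != inner for row in a):
        raise ValueError("columns of a must match rows of b")
    cols = len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(inner))
             for j in range(cols)]
            for i in range(len(a))]


print(mat_mul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```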

vioravis · Posted 05-08-2012 02:01 PM

Thanks, Rob for your help.

vioravis · Posted 05-08-2012 04:32 PM

Hi Rob,

I am trying to program the whole objective function and need help in the following:

1. How do I create the reverse of "impvar calculatedRisk {i in OBS} = CDF('WEIBULL',hours[i],x[1],x[2])"? I want a variable or array with the same elements as calculatedRisk but in reverse order.

2. Can I create another variable newRisk such that newRisk[1] = calculatedRisk[1] and all the other values of newRisk[] are zero.

3. Can I multiply the individual elements of calculatedRisk and newRisk??? First element of one with the last element of the other and so on.

Please let me know if these are possible.

Thank you.

Ravi

RobPratt · Posted 05-08-2012 06:17 PM

Yes, you can do all of these things, but it will likely be inefficient to introduce new variables whose values are equal to existing variables. Instead, you can express everything in terms of calculatedRisk for different values of i. For example, to achieve #1, you can do the following:

num n = card(OBS);

impvar calculatedRiskReverse {i in OBS} = calculatedRisk[n-i+1];

But you can avoid introducing a new variable by just using calculatedRisk[n-i+1] everywhere you need it:

for {i in OBS} put calculatedRisk[n-i+1]=;

For #2:

impvar newRisk {i in OBS} = (if i = 1 then calculatedRisk[1] else 0);

For #3:

calculatedRisk[i] * newRisk[n-i+1]

Again, I think you will be better off not introducing additional variables unless you need to. If you write out the full expression for your objective, I suspect you will find that you can eliminate the additional variables by using logical conditions in the SUM operator.
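
The three index patterns above can be checked with a small Python sketch. It is illustrative only, not SAS: the function names are hypothetical, and because Python lists are 0-based, the SAS index n-i+1 becomes n-1-k here:

```python
def reverse_risk(calculated_risk):
    # #1: element i of the result is calculatedRisk[n-i+1]
    n = len(calculated_risk)
    return [calculated_risk[n - 1 - k] for k in range(n)]


def first_only(calculated_risk):
    # #2: keep the first element, set every other element to zero
    return [calculated_risk[0] if k == 0 else 0.0
            for k in range(len(calculated_risk))]


def cross_products(calculated_risk, new_risk):
    # #3: first element of one times last element of the other, and so on
    n = len(calculated_risk)
    return [calculated_risk[k] * new_risk[n - 1 - k] for k in range(n)]


print(reverse_risk([1, 2, 3]))                    # [3, 2, 1]
print(first_only([5.0, 6.0, 7.0]))                # [5.0, 0.0, 0.0]
print(cross_products([1, 2, 3], [10, 20, 30]))    # [30, 40, 30]
```

None of these helpers stores anything new; each is just a different way of indexing the same list, which is the point about not introducing extra variables.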

vioravis · Posted 05-09-2012 02:06 PM

Hi Rob,

Thanks a lot for your help on those questions. As I am coding up the remaining objective function, I have a couple more questions:

1. I want to implement the following for loop

for (i in 2:n)
{
  updatedRisk[i] = sum(calculatedRisk[1:(i-1)]*newRisk[(n-i+2):n])
}

I tried using the DO loop as follows.

impvar updatedRisk {i in OBS} = 0;

num j;

do j = 2 to n;

updatedRisk = sum(calculatedRisk[1:j-1]*newRisk[(n-j+2):n]);

end;

However, neither the impvar statement nor the do loop is working as expected. Could you please help me with the right approach for this???

2. I am trying to do interpolation on a variable as follows:

data input;
    input var1 value;
    datalines;
1 5
2 10
3 15
4 20
5 25
;
RUN;

Based on the value of var1, I am trying to interpolate for var2 as given below (these are not separate datasets but just the variables created within OPTMODEL. I have variables var1, var2 and value created using IMPVAR and would like to impute "value" for var2.)

data output;
    input var2 interpolatedValue;
    datalines;
1.5 7.5
2.5 12.5
3.5 17.5
4.5 22.5
;
RUN;

Even if linear interpolation is not possible, the nearest approximation is sufficient, as in the following R code using the approx function:

yinterp=approx(var1,value,xout=var2)

Please let me know if there is a way to implement this.

Thanks again.

Ravi

RobPratt · Posted 05-09-2012 02:51 PM

I will reply to #1 first and #2 in a separate post.

If you want to use IMPVAR, you have to include the formula in the declaration. You cannot update it later. The other problem is that the SUM operator has different syntax than what you tried. Here's what I think you wanted (a dot product of two vectors):

impvar updatedRisk {i in 2..n} = sum {j in 1..i-1} calculatedRisk[j] * newRisk[n-i+1+j];

RobPratt · Posted 05-09-2012 03:04 PM

Regarding linear interpolation, PROC OPTMODEL does not have anything built in. But you can try some of the ideas suggested in this blog posting:

http://blogs.sas.com/content/iml/2012/03/16/linear-interpolation-in-sas/

and this Usage Note:

http://support.sas.com/kb/24/560.html

You can do the interpolation outside of OPTMODEL and then read the results into an OPTMODEL array.
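
As a reference for what the interpolation step should produce, here is a minimal piecewise-linear sketch in Python, using the var1/value/var2 numbers from the question. It is not SAS and not taken from the linked posts: `interpolate` is a hypothetical helper, and it assumes the x values are strictly increasing and that there are at least two of them. Unlike R's approx, which by default returns NA outside the tabulated range, this sketch raises an error there:

```python
from bisect import bisect_right


def interpolate(xs, ys, x_new):
    """Linear interpolation of (xs, ys) at x_new; xs must be strictly increasing."""
    if not xs[0] <= x_new <= xs[-1]:
        raise ValueError("x_new lies outside the tabulated range")
    # Index of the right-hand neighbor, capped so x_new == xs[-1] still works
    hi = min(bisect_right(xs, x_new), len(xs) - 1)
    lo = hi - 1
    weight = (x_new - xs[lo]) / (xs[hi] - xs[lo])
    return ys[lo] + weight * (ys[hi] - ys[lo])


var1 = [1, 2, 3, 4, 5]
value = [5, 10, 15, 20, 25]
var2 = [1.5, 2.5, 3.5, 4.5]

print([interpolate(var1, value, v) for v in var2])  # [7.5, 12.5, 17.5, 22.5]
```

The resulting values could then be written to a data set and read into an OPTMODEL array, as suggested above.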
