Coding issue about nonnegative parameter estimation

Marx · Posted 10-13-2016 08:50 AM

Dear SAS/IML experts,

I would like to estimate parameters (x[1], x[2], x[3], x[4]) in the cost function (This is a minimization problem).

obj=sum( ((x[1]/2)*(M1-G1)^2 + (x[2]/2)*(M2-G2)^2 + (x[3]/2)*(1-EQ)^2))*EXP(-x[4]*T));

I have several contraints in the cost function.

1. (M1-G1)+ =max[0, (M1-G1)]

2. (M2-G2)+ =max[0, (M2-G2)]

3. (1-EQ)+ =max[0, (1-EQ)]

4. x[1], x[2], x[3], x[4] >0

I would like to know the coding below is correct or not. After I ran SAS, I cannot see any numbers for the parameters.

I think the coding has some problems. Also, I would like to know how to add three constraints (1, 2, and 3) above.

Thank you very much in advance.

Sincerley yours,

J1

data have;

infile cards truncover expandtabs;

input G1 G2 M1 M2 EQ T;

cards;

5000 4000 4,824 2,461 1.073446328 1990

5000 4000 4,822 2,585 1.06741573 1991

5000 4000 4,911 2,696 1.06741573 1992

5000 4000 5,033 2,879 1.06442577 1993

5000 4000 5,098 3,058 1.058495822 1994

5000 4000 5,138 3,320 1.052631579 1995

5000 4000 5,261 3,463 1.046831956 1996

5000 4000 5,375 3,470 1.043956044 1997

5000 4000 5,411 3,324 1.035422343 1998

5000 4000 5,510 3,318 1.032608696 1999

5000 4000 5,702 3,405 1.027027027 2000

5000 4000 5,601 3,488 1.02425876 2001

5000 4000 5,649 3,694 1.018766756 2002

5000 4000 5,679 4,525 1.010638298 2003

5000 4000 5,763 5,288 1.00795756 2004

5000 4000 5,795 5,790 1 2005

5000 4000 5,704 6,414 0.994764398 2006

5000 4000 5,795 6,792 0.989583333 2007

5000 4000 5,622 7,035 0.984455959 2008

5000 4000 5,274 7,692 0.981912145 2009

5000 4000 5,409 8,257 0.974358974 2010

;

run;

proc iml;

use have;

read all var {G1 G2 M1 M2 EQ T};

close;

start func(x) global(G1, G2, M1, M2, EQ T);

obj=sum( ((x[1]/2)#(M1-G1)##2 + (x[2]/2)#(M2-G2)##2 + (x[3]/2)#(1-EQ)##2))*EXP(-x[4]*T));

return (obj);

finish;

start C_UC2D(x);

c = j(4,1,0);

c[1]=1-x[1];

c[2]=1-x[2];

c[3]=1-x[3];

c[4]=1-x[4];

return(c);

finish;

call randseed(1234567);

x = j(10,4,.);

temp=j(1,4,.);

opt = j(1, 11, .);

opt[1]=1;opt[2] = 0; opt[10] = 4;

do i=1 to nrow(x);

call randgen(temp,'uniform');

CALL NLPNMS(rc,xres,"func",temp,opt, , , , , "C_UC2D");

x[i,]=xres;

end;

create want from x[c=('x1':'x4')];

append from x;

close;

quit;

proc print noobs;run;

proc means data=want mean stderr ;

run;

Rick_SAS · Posted 10-13-2016 10:25 AM

There are a lot of problems here, so I'll have to be brief. I encourage you to look at the article "Maximum likelihood estimation in SAS/IML" and the IML documentation about how to solve constrained optimizations.

Here are some commetns:

1. The paramters are x[1] through x[4]. The data is considered constant. You can set constraints on the parameters, but not on the data. You need to clean the data to make sure that it makes sense.

2. There are problems in your DATA step

3. You have mismatched parentheses in the definition of the objective function. Does the expoential term multiply the sum of the other terms or only the last term?

4. You need to rescale the T variable, perhaps by setting T = T - 1990. The expression EXP(-x[4]*T) will be zero unless you rescale.

5. You have LINEAR constraints on the boundary of the parameter space. therefore you do not need the C_UC2D function. Instead, use a 2x4 matrix in which the first row specifies the lower constraints and the second row contains missing values to indicates that there are not upper bound constraints.

6. Since you want to minimize, set opt[1]=0.

Here's a start. You need to decide what to do about (3) and (4).

data have;
infile cards truncover expandtabs;
input G1 G2 M1 M2 EQ T;
cards;
5000  4000  4824 2461 1.073446328 1990
5000  4000  4822 2585 1.06741573  1991
5000  4000  4911 2696 1.06741573  1992
5000  4000  5033 2879 1.06442577  1993
5000  4000  5098 3058 1.058495822 1994
5000  4000  5138 3320 1.052631579 1995
5000  4000  5261 3463 1.046831956 1996
5000  4000  5375 3470 1.043956044 1997
5000  4000  5411 3324 1.035422343 1998
5000  4000  5510 3318 1.032608696 1999
5000  4000  5702 3405 1.027027027 2000
5000  4000  5601 3488 1.02425876  2001
5000  4000  5649 3694 1.018766756 2002
5000  4000  5679 4525 1.010638298 2003
5000  4000  5763 5288 1.00795756  2004
5000  4000  5795 5790 1           2005
5000  4000  5704 6414 0.994764398 2006
5000  4000  5795 6792 0.989583333 2007
5000  4000  5622 7035 0.984455959 2008
5000  4000  5274 7692 0.981912145 2009
5000  4000  5409 8257 0.974358974 2010
;
run;

proc iml;
use have;
read all var {G1 G2 M1 M2 EQ T};
close;

T = T - min(T);  /* start time at 1990??? */

start func(x) global(G1, G2, M1, M2, EQ, T);
 /* ??? Which terms does EXP term multiply? */
 v = ( x[1]/2#(M1-G1)##2 + x[2]/2#(M2-G2)##2 + x[3]/2#(1-EQ)##2 ) # EXP(-x[4]*T);
 obj=sum( v );
 return (obj);
finish;

/* test objective func */
x = j(4,1,1);
LL = func(x);
print LL;

opt = j(1, 11, .);
opt[1]=0;opt[2] = 4;

/*    constraint matrix */
con = { 1e-16 1e-16 1e-16 1e-16,  /* lower bounds above 0 */
        .     .     .     .};     /* no constraints on upper bounds */
CALL NLPNMS(rc,xres,"func", x,opt, con);
print rc xres;

Marx · Posted 10-14-2016 04:26 AM

Dear Rick,

Thank you very much for your response.

For 3, I attahced the objective function. Each variable has time variable. For example, M1 => M1(T)

If I want to estimate the parameter values for this objective function, then I would like to know how to make a coding in SAS.

For 4, I agree on your opition. In this case, I would like to change T. T= T-1990.

I would like to know if we can estimate parameter values using quasi-full-information-maximum-likelihood with SAS/IML or not.

Sincerely yours,

J1

Ksharp · Posted 10-14-2016 02:38 AM

I don't understand your constrained condition:

1. (M1-G1)+ =max[0, (M1-G1)]

2. (M2-G2)+ =max[0, (M2-G2)]

3. (1-EQ)+ =max[0, (1-EQ)]

What it is '+=' ?

Ksharp · Posted 10-14-2016 04:01 AM

Why use SUM in object function ,since each row corresponding to a solution ?

And assuming constrained condition is :

1. (M1-G1) =max[0, (M1-G1)]

2. (M2-G2)=max[0, (M2-G2)]

3. (1-EQ) =max[0, (1-EQ)]



data have;
infile cards truncover expandtabs;
input G1 G2 M1 : comma32. M2 : comma32. EQ T;
cards;
5000  4000  4,824 2,461 1.073446328 1990
5000  4000  4,822 2,585 1.06741573  1991
5000  4000  4,911 2,696 1.06741573  1992
5000  4000  5,033 2,879 1.06442577  1993
5000  4000  5,098 3,058 1.058495822 1994
5000  4000  5,138 3,320 1.052631579 1995
5000  4000  5,261 3,463 1.046831956 1996
5000  4000  5,375 3,470 1.043956044 1997
5000  4000  5,411 3,324 1.035422343 1998
5000  4000  5,510 3,318 1.032608696 1999
5000  4000  5,702 3,405 1.027027027 2000
5000  4000  5,601 3,488 1.02425876  2001
5000  4000  5,649 3,694 1.018766756 2002
5000  4000  5,679 4,525 1.010638298 2003
5000  4000  5,763 5,288 1.00795756  2004
5000  4000  5,795 5,790 1           2005
5000  4000  5,704 6,414 0.994764398 2006
5000  4000  5,795 6,792 0.989583333 2007
5000  4000  5,622 7,035 0.984455959 2008
5000  4000  5,274 7,692 0.981912145 2009
5000  4000  5,409 8,257 0.974358974 2010
;
run;
 
proc iml;
use have nobs nobs;
read all var {G1 G2 M1 M2 EQ T};
close;
 
start func(x) global(M1_G1,M2_G2,_1_EQ,_T);
 obj=((x[1]/2)#M1_G1##2 + (x[2]/2)#M2_G2##2 + (x[3]/2)#_1_EQ##2)*EXP(-x[4]*_T);
 return (obj);
finish;


/*number of Xi*/
n=4; 
/* constraint conditions - 0<=x1,0<=x2,...*/
con=repeat({0,.},1,n);
/* initial value of x1,x2,...*/
x=j(1,n,1/n);
/* minimize object function value*/
optn={0 0};

solution=j(nobs,n+1,.);
do i=1 to nobs;
M1_G1=max(0,(M1[i]-G1[i]));
M2_G2=max(0,(M2[i]-G2[i]));
_1_EQ=max(0,(1-EQ[i]));
_T=T[i]; 
 /* xres is a solution if rc>0 */
 call nlpnra(rc,xres,"func",x,optn,con);
solution[i,1]=rc;
solution[i,2:n+1]=xres;
end;

vname={return_code}||('x1':'x'+char(n));
create want from solution[c=vname];
 append from solution;
 close;
quit;
title 'Xi is a solution if return code > 0';
proc print noobs;run;

proc means data=want mean stderr ;
run;

Marx · Posted 10-14-2016 04:41 AM

Dear Ksharp,

Thank you very much for your coding.

I attahced the objective function here. Each variable has time variable (e.g. M1 => M1(T)).

For this objective function, only positive differences are considered. (M1-G1)+=max[0, (M1-G1)]

I saw the results about the parameter values are the same. It does not make sense to me. If I want to add one more constraint x[1] > x[2], then I would like to know how to add this constraint.

Sincerely yours,

J1

Rick_SAS · Posted 10-14-2016 08:57 AM

> If I want to add one more constraint x[1] > x[2], then I would like to know how to add this constraint.

To add additional constraints, you would add additional rows to the constraint matrix, as explained in the documentation. For example, x[1] > x[2] is coded as


/*    constraint matrix */
con = { 1e-16 1e-16 1e-16 1e-16 . .,  /* lower bounds above 0 */
        .     .     .     .     . .,  /* no constraints on upper bounds */
        1     -1     .    .     1 1e-16};  /* x1 - x2 > 1e-16 */

The notation (M-G)+ is a standard mathematical definition, but I don't understand why it is relevant. That function does not seem to be part of the objective function.

I think you need to understand your problem a little better before you try to optimize. For example, the code you sent optimizes the integrand in the PDF file, but the formula in the PDF file indicates that problem is to minimize a definite integral over possible values of G1_bar and G2_bar. Which is it?

Are the parameters (nu1, nu2, xi, delta3) and the objective function the integrand?
Are the parameters (G1_bar, G2_bar) and the objective function the integral? In that case, the objective function you wrote is wrong because you need to integrate the EXP term w/r/t t and evaluate at t1 and t0.

Marx · Posted 10-15-2016 02:10 AM

Dear Rick,

"The notation (M-G)+ is a standard mathematical definition, but I don't understand why it is relevant. That function does not seem to be part of the objective function."

My objective function is the social cost function abouth emissions.

M1 is the actural emissions, and G1 is the limits of emissions.

nu1*(M1-G1)^2/2 is the abatemet cost function for the emissions. In this case, it was assumed that it is cosidered only positive difference between M1 and G1. So, for the objective function, I added (M1-G1)+ = max (0, M1-G1).

1. The parameters (nu1, nu2, xi, delta3) are fixed for the objective function.

2. The objective function is

(nu1*(M1(1990)-G1(1990))^2/2 + nu2*(M2(1990)-G2(1990))^2/2 + xi*(1-EQ(1990))^2/2)*EXP(-delta3*(1990))

+(nu1*(M1(1991)-G1(1991))^2/2 + nu2*(M2(1991)-G2(1991))^2/2 + xi*(1-EQ(1991))^2/2)*EXP(-delta3*(1991))

+...

+(nu1*(M1(2000)-G1(2000))^2/2 + nu2*(M2(2000)-G2(2000))^2/2 + xi*(1-EQ(2000))^2/2)*EXP(-delta3*(2000))

FYI, I am studying about the optimal control probelm. This objective function has many constrains (nonlinear) including state equations.

I need appropriate parameter values to solve the optimal control problem.

I would like to know if this way to get the parameter values for the optimal control problem using SAS/IML is correct or not.

Sincerey yours,

J1

Ksharp · Posted 10-15-2016 02:40 AM

I saw a problem.

EXP(-delta3*(1990)) will be near zero due to 1990. So any of them multiply it will be zero, your obj value will always be zero.

I also noticed in PDF t is continuous variable, but in your real data t is discrete, I think you should not include t in your obj function.

Ksharp · Posted 10-14-2016 11:53 PM

OK. I remember it is untility function.

Rick make a lot of sense. I think either your objection is not right or you need more constrain condition.

The output is also not look right.



data have;
infile cards truncover expandtabs;
input G1 G2 M1 : comma32. M2 : comma32. EQ T;
cards;
5000  4000  4,824 2,461 1.073446328 1990
5000  4000  4,822 2,585 1.06741573  1991
5000  4000  4,911 2,696 1.06741573  1992
5000  4000  5,033 2,879 1.06442577  1993
5000  4000  5,098 3,058 1.058495822 1994
5000  4000  5,138 3,320 1.052631579 1995
5000  4000  5,261 3,463 1.046831956 1996
5000  4000  5,375 3,470 1.043956044 1997
5000  4000  5,411 3,324 1.035422343 1998
5000  4000  5,510 3,318 1.032608696 1999
5000  4000  5,702 3,405 1.027027027 2000
5000  4000  5,601 3,488 1.02425876  2001
5000  4000  5,649 3,694 1.018766756 2002
5000  4000  5,679 4,525 1.010638298 2003
5000  4000  5,763 5,288 1.00795756  2004
5000  4000  5,795 5,790 1           2005
5000  4000  5,704 6,414 0.994764398 2006
5000  4000  5,795 6,792 0.989583333 2007
5000  4000  5,622 7,035 0.984455959 2008
5000  4000  5,274 7,692 0.981912145 2009
5000  4000  5,409 8,257 0.974358974 2010
;
run;
 
proc iml;
use have nobs nobs;
read all var {G1 G2 M1 M2 EQ T};
close;
 
start func(x) global(G1, G2, M1, M2, EQ, T); 
 obj=sum(  ( x[1]#((M1-G1)<>0)##2 + x[2]#((M2-G2)<>0)##2 + x[3]#((1-EQ)<>0)##2 )#EXP(-x[4]#T) );
 return (obj);
finish;


/*number of Xi*/
n=4; 
/* constraint conditions - 0<=x1,0<=x2,...*/
con={0  0 0 0 . .,
     .  . . . . .,
     1 -1 . . 1 0};
/* minimize object function value*/
optn={0 0};
call randseed(1234567890);
solution=j(100,n+1,.); /*Change 100 to get more solution*/
do i=1 to 100; /*Change 100 to get more solution*/
/* initial value of x1,x2,...*/
x=j(1,n,.);
call randgen(x,'uniform'); 
 /* xres is a solution if rc>0 */
 call nlpnra(rc,xres,"func",x,optn,con);
solution[i,1]=rc;
solution[i,2:n+1]=xres;

/*Check initial value and solution*/
print x[l='Initial Value'],xres[l='Solution'];
end;

vname={return_code}||('x1':'x'+char(n));
create want from solution[c=vname];
 append from solution;
 close;
quit;
title 'Xi is a solution if return code > 0';
proc print noobs;run;

proc means data=want mean stderr ;
run;

Marx · Posted 02-12-2017 10:10 AM

Dear Ksharp,

I asked about coding below last year.

If someone asked me where the detailed results (parameter estimation) can be found, then I would like to how to explain this verbally.

"The parameters in the social cost function is estimated using nonlinear optimazation by Newton-Raphson method in SAS/IML. Parameters are estimated by maximum likelihood using a Newton-Raphson algorithm. Using the sample normal approximation, standard errors of the parameter esimates are estimated from the inverse of the observed information matrix."

I would like to know this explanation is correct or not.

Thank you very much in advance.

Sincerely yours,

J1

Ksharp · Posted 02-12-2017 09:24 PM

Sorry. I think that didn't describe correctly.
Should like:
The parameters in the social cost function is estimated using nonlinear optimazation by Newton-Raphson method in SAS/IML. Parameters are estimated by  minimizing object function(cost function) using a Newton-Raphson algorithm. Using a lot of different initial values (x) which are used for Newton-Raphson method, standard errors of the parameter esimates are estimated from calculating the std of all these solutions.

Marx · Posted 02-14-2017 12:54 AM

Dear Ksharp,

Thank you very much for your help.

Sincerely yours,

J1

Coding issue about nonnegative parameter estimation

Re: Coding issue about nonnegative parameter estimation

Re: Coding issue about nonnegative parameter estimation

Re: Coding issue about nonnegative parameter estimation

Re: Coding issue about nonnegative parameter estimation

Re: Coding issue about nonnegative parameter estimation

Re: Coding issue about nonnegative parameter estimation

Re: Coding issue about nonnegative parameter estimation

Re: Coding issue about nonnegative parameter estimation

Re: Coding issue about nonnegative parameter estimation

Re: Coding issue about nonnegative parameter estimation

Re: Coding issue about nonnegative parameter estimation

Re: Coding issue about nonnegative parameter estimation

Ready to join fellow brilliant minds for the SAS Hackathon?