The code below is for finding maximum likelihood estimates "beta0_hat, beta1_hat, and tau_hat" for GEV regression. I get the following errors for all replicates: "ERROR: QUANEW Optimization cannot be completed. WARNING: Optimization routine cannot improve the function value."
How can the code be fixed to satisfy convergence and get the corrected MLEs?
options nonotes;
proc iml;
start Clip(x, ab);
a = ab[1];
b = ab[2]; /* Ensure ab is a 1x2 vector {min max} */
return( a <> (x >< b) ); /* Elementwise min/max */
finish;
/*Create module to find and return sum of log pmf of Bernoulli dist*/
start SUMLOGLHF(beta) global(n,x,y);
**sum=0;
/* Extract scalar parameters from the input vector */
beta0_star = beta[1];
beta1_star = beta[2];
xi_star= beta[3];
log_sum=0;
/* Calculate linear predictor for all observations */
eta = x * (beta[1:2]); /* t(beta[1:2]) is row vector 1 by 2*/
do j=1 to nrow(y);
z = -eta[j];
z = Clip(z, {-30 5});
if xi_star=0 then p = exp(-exp(-z));
else if xi_star^=0&(xi_star*z)>(-1) then p=exp(-((1+(xi_star*z))**(-1/xi_star)));
else if xi_star>0 & z<= (-1/xi_star) then p=0;
else if xi_star<0 & z>= (1/abs(xi_star)) then p=1;
else p=.;
If p=0 then p_adj=1-0.000000001;
If p=1 then p_adj=1-0.999999999;
else p_adj=1-p;
/* Bound probability for numerical stability */
** p_adj = max(1e-10, min(1-1e-10, p));
log_sum=log_sum+logpdf("Bernoulli",y[j,1],p_adj);
end;
return (-log_sum);/* This module is intended to be a function that returns a value, a RETURN statement is included to specify the value*/
finish;
start driv(beta) global(n, x, y);
beta0_star = beta[1];
beta1_star = beta[2];
xi_star= beta[3];
eta = x * (beta[1:2]);
z = -eta;
/* Applying clipping for stability in gradient calculation */
z_adj = Clip(z, {-30 5});
p = j(nrow(z_adj), 1, 0); /* Initialize p */
safe_val = j(nrow(z_adj), 1, 1); /* Default for xi=0 */
val = 1 + xi_star # z_adj;
safe_val = Clip(val, {1e-10 1e10});
if xi_star=0 then p = exp(-exp(-z_adj));
else do;
p = exp((-(safe_val) ## (-1/xi_star)));
if xi_star > 0 then p = choose(z_adj <= (-1/xi_star), 0, p);
else p = choose(z_adj >= (1/abs(xi_star)), 1, p);
end;
p_adj = max(1e-10, min(1-1e-10, p));
/* Use safe_val in derivative calculations */
common = ((log(p_adj))/(safe_val + 1e-10)) # ((y - p_adj) / (1 - p_adj));
g = j(1, 3, 0);
g[1] = sum(common);
g[2] = sum(common#x[,2]);
xi_term = ((log(safe_val + 1e-10))/((xi_star+ 1e-10)**2))-(z/((xi_star + 1e-10) # safe_val));
g[3] = sum(xi_term #(log(p_adj))#((y-p_adj)/(1-p_adj)));
return(-g);
finish driv;
replicate=1000;/*1000 replications*/
n=25;/*Sample size for each replicate is 25*/
Est=j(replicate,3,0);/*Matrix that includes the estimates of beta_0 and beta_1*/
colMean=j(3,1,0);/*Vector that includes the mean values of the estimates of beta_0, beta_1, and xi*/
colStanddev=j(3,1,0);/*Vector that includes the standard deviation values of the estimates of beta_0, beta_1, and xi*/
/*Do loop for 1000 times*/
do k=1 to replicate;
x1=j(1,n);
x1t=t(x1);
call randseed(3205+k);/*Change seed each iteration by adding k to avoid getting the same result 1000 times*/
Call randgen(x1t, "NORMAL", 0, 1); /*Generates N(0,1)*/
x2=j(n,1,1);
x=x2||x1t;/*x is the design matrix*/
y=j(n,1);/*y is the vector that includes the generated y data*/
beta0=1;
beta1=3;
xi=0.3;
/*calculate prob which is the cdf of GEV and generate y data based on prob*/
do j=1 to n;
/*using neg_eta_true in the data generation versus "-eta[j]*/
eta_true=beta0+beta1*x1t[j,1];/*calculate eta inside the function*/
neg_eta_true=-eta_true;/*Calculate negative eta*/
if xi=0 then p = exp(-exp(-neg_eta_true));
else if xi^=0&(xi*neg_eta_true)>(-1) then p=exp(-((1+(xi*neg_eta_true))**(-1/xi)));
else if xi>0 & neg_eta_true<= (-1/xi) then p=0;
else if xi<0 & neg_eta_true>= (1/abs(xi)) then p=1;
else p=.;
If p=0 then p_true=1-0.000000001;
If p=1 then p_true=1-0.999999999;
else p_true=1-p;
/* Bound p to avoid randgen errors */
p_true = max(1e-10, min(1-1e-10, p));
Call randgen(m,"Bernoulli",p_true); /* Generates Bernoulli dist. for Y generated data and probability of success =p_adj*/
y[j,1]=m;
end;
sumy=sum(y);
print sumy;
initial_param = {0 1 0}; /* intial_beta0=0, intial_beta1=1, initial_xi=0 */
**initial_param={0 1};
opt = j(1, 10, .);
opt[1] = 0; /* Minimize */
opt[2] = 1; /* Change to 1 or 3 to print messages to log if needed */
opt[5] = 1000; /* Increased Max Iterations */
opt[6] = 2000; /* Increased Max Function Calls */
/* Add con before call to enforce "xi">-0.5(or higher) to prevent numerical collapse,as the distribution becomes undefined for very negative*/
con = { . . -0.5, /* lb: b0, b1, xi */
. . 1.0 }; /* ub: b0, b1, xi */
/* Call Optimizer */
call nlpnra(rc, res, "SUMLOGLHF", initial_param,opt, con) grd="driv";
Est[k,]=res;/*Matrix that includes 1000 estimates of beta_0_hat, beta_1_ hat, and shape parameter_hat*/
end;
*print Est;
/*Calculate mean based on MLEs for beta_0 and beta_1 and also bias*/
colMean=t(mean(Est));
print colMean;
quit;
options notes;
The first derivative formulas are attached.
I haven't checked your code on this Easter Monday, but I wanted to make you aware about this blog :
Implement the generalized extreme value distribution in SAS
By Rick Wicklin on The DO Loop August 25, 2025
https://blogs.sas.com/content/iml/2025/08/25/gev-sas.html
... and this SAS Communities Library Article :
Extreme Value Theory for Modern Risk Management Systems
Last update: 2 weeks ago
Updated by: SAS Employee sunilbhardwaj
https://communities.sas.com/t5/SAS-Communities-Library/Extreme-Value-Theory-for-Modern-Risk-Manageme...
BR, Koen
opt[5] = 1000; /* Increased Max Iterations */
opt[6] = 2000; /* Increased Max Function Calls */
The max number of iterations and function calls are not set through opt but through a separate tc argument. Try
tc = j(1, 13, .); tc[1] = 1000; /* Increased Max Iterations */ tc[2] = 2000; /* Increased Max Function Calls */ /* Call Optimizer */ call nlpnra(rc, res, "SUMLOGLHF", initial_param, opt, con, tc) grd="driv";
See the official documentation for details:
SAS Help Center: Options Vector
SAS Help Center: Termination Criteria
Thank you for your response.
I implemented the suggested code:
tc = j(1, 13, .);
tc[1] = 1000; /* Increased Max Iterations */
tc[2] = 2000; /* Increased Max Function Calls */
/* Call Optimizer */
call nlpnra(rc, res, "SUMLOGLHF", initial_param, opt, con, tc) grd="driv";
I still get the same error:
ERROR: NEWRAP Optimization cannot be completed.
WARNING: Optimization routine cannot improve the function value.
How can the code be fixed to satisfy convergence and get the corrected MLEs?
> I still get the same error:
ERROR: NEWRAP Optimization cannot be completed.
WARNING: Optimization routine cannot improve the function value.
A more accurate statement is, "For about 10% of the random samples of size N=25, the optimization does not converge."
This is typical behavior for three-parameter models, especially those with a threshold parameter. In some cases, the log-likelihood does not have a maximum on the feasible domain. The parameter estimates (especially the xi parameter for GEV) are very sensitive to samples whose shapes deviate from the distribution. For example, if the random sample generates too many or too few observations from the tail of the distribution, or few observations close to the threshold, then the MLE might not converge.
You can find many papers and textbooks that discuss these issues. For example, see Rosian, Na, and Gabda (2020), whose abstract begins with the sentence, "The standard method of the maximum likelihood has poor performance in GEV parameter
estimates for small sample data."
I guess the important question to ask is, "what are you trying to accomplish or learn by running this simulation study?"
Is there anything else we can help you with? If not, please close this thread.
Nearly 200 sessions are now available on demand with the SAS Innovate Digital Pass.
Explore Now →