Call NLPQN to maximize likelihood function running out of memory

RobertThompson · Posted 08-23-2019 09:26 AM

I am having memory issues with using call NLPQN to maximize a likelihood function in an IML module. After ~60 iterations, the program stops because it has run out of available memory for symbols. I used the task manager to observe memory usage while the program is running and the program memory usage keeps increasing until about 1900 MB and quits. Could it be that the NLPQN call is creating the new matrices whose symbols are taking up the available space? It seems that as the NLPQN call iterates, the memory is being used up. There are 29 parameters to maximize the likelihood function, and ~4500 vectors (each ~ 1x10) of observations that I am looping over in a kalman filter. Could it be that this number of observations and dimensions of the likelihood is going to need this much memory anyway? Is there any way to fix this issue?

%let mydir 		= E:\FeederCattle;				/* Working directory that contains futures data	from CRB		*/
%let COMMODITY  = FEEDERCATTLE;											/* Commodity to estimate model with								*/
libname mydir "&mydir.";

proc iml; 

use mydir.prices_&COMMODITY;
read all into price_master;

use mydir.ttms_&COMMODITY;
read all into ttm_master;

use mydir.dates_&COMMODITY;
read all into dates_master;

use mydir.other_needed_stuff_&COMMODITY;
read all into other_needed_stuff;

total_obs = other_needed_stuff[1,1];
NNNNN = other_needed_stuff[2,1]; /* shows max number of futures prices used over all days */
call symputx("end",total_obs);
call symputx("numfut",NNNNN);
free / dates_master price_master ttm_master other_needed_stuff;


start liklhd(z) global(log_L, xt_out, price_master, ttm_master, dates_master, other_needed_stuff);

total_obs = other_needed_stuff[1,1];
NNNNN = other_needed_stuff[2,1]; /* shows max number of futures prices used over all days */

dt = (1/365.25);



/*     ****************************************************** Now begin Schwartz and Smith Model *********************************		*/

k = z[1];														/* Parameters to estimate													*/
sigmax = z[2];
lambdax = z[3];
mu = z[4];
sigmae = z[5];
rnmu = z[6];
rho = z[7];  
%macro errors;
%do i = 1 %to &numfut; 
s&i. = z[7+&i.];
%end;
%mend;
%errors;
beta_1 = z[7 + NNNNN + 1];
beta_2 = z[7 + NNNNN + 2];
beta_3 = z[7 + NNNNN + 3];
beta_4 = z[7 + NNNNN + 4];
beta_5 = z[7 + NNNNN + 5];
beta_6 = z[7 + NNNNN + 6];
beta_7 = z[7 + NNNNN + 7];
beta_8 = z[7 + NNNNN + 8];
beta_9 = z[7 + NNNNN + 9];
beta_10 = z[7 + NNNNN + 10];
beta_11 = z[7 + NNNNN + 11];

x0 = {-0.2, 4.8};      												/* Initial state vector [chi_0 , xi_0] 2x1 									*/
C0 = {0.1 0.05, 0.05 0.3};  										/* Initial covariance matrix for the state variables W(t)=cov[xt,et] 2x2	*/
m = nrow(x0);														/* m = Number of state variables (number of rows in x_t) =2					*/
																	/* THE TRANSITION EQUATION 													*/
																	/* NOTATION: x(t)=c+G*x(t-1)+n(t)   n~N(0,W(t)) 							*/
c = j(m,1,0);
c[m,] = mu*dt;
G = i(m);
G[1,1] = exp(-k*dt);

W11=(1-exp(-2*k*dt))*((sigmax)**2)/(2*k); 							/* W is derived from equation (3b) in S&S				 					*/
W12=(1-exp(-k*dt))*((rho*sigmax*sigmae)/k);
W22=((sigmae)**2)*dt;
W=i(m);
W[1,1]=W11; W[1,2]=W12; W[2,1]=W12; W[2,2]=W22;
																	/* THE MEASUREMENT EQUATION 												*/
																	/* NOTATION: y(t)=d(t)+F(t)'x(t)+v(t)     v~N(0,V) 							*/

																	/* Here I make a parameter matrix for each observation. The parameter 		*/
																	/* matrices are identical except for their dimensions. This way, each set 	*/
																	/* of observations has a corresponding parameter matrix in the measurement	*/
																	/* equation that has the correct dimensions									*/


																	/* ----------------KALMAN-------------------*/
																	/* ----------------FILTER-------------------*/
																	/*    I set up a macro to use different 	*/
																	/*  parametermatrices for each observation	*/
xt_1 = x0;
Ct_1 = C0;

xt_out = j(total_obs,m,.);
dQt_1_out = j(total_obs,1,.);
vttFvtt_out = j(total_obs,1,.);
total_futures_prices_used = 0;
do i = 1 to &end;	 
																	/* an observed price vector for each day. that  */
																	/* vector has all futures contracts, of which 	*/
	ttm___ = ttm_master[i,];										/* a perfectly corresponding ttm vector to go 	*/
	ttm_loc = loc(ttm___>=0);										/* with the futures price vector.				*/
	num_futures_ttm = ncol(ttm_loc);

	ttm = j(1,num_futures_ttm,.);	
	do p = 1 to num_futures_ttm;
		temp_ttm = ttm_loc[1,p];
		ttm[1,p] = ttm_master[i,temp_ttm];/**/
	end;

	price = j(1,num_futures_ttm,.);
	do p = 1 to num_futures_ttm;
		temp_price = ttm_loc[1,p];
		price[1,p] = price_master[i,temp_price];/**/
	end;

	dim = ncol(ttm);
	total_futures_prices_used = total_futures_prices_used + dim;

	d = j(dim,1,0); * d is a {N x 1} Vector;
	F = j(dim,m,0); * F is a {N x m} Matrix; 
	season = j(dim,1,.); 
	temp = month(dates_master[i,1] + (ttm*365.25));

	do q=1 to dim;
	    d1    =rnmu*ttm[1,q]-(1-exp(-k*ttm[1,q]))*(lambdax/k);  /* equation (9) in S&S */
	    d2    =(1-exp(-2*k*ttm[1,q]))*((sigmax)**2)/(2*k);
	    d3	  =((sigmae)**2)*ttm[1,q];
	    d4    =2*(1-exp(-k*ttm[1,q]))*((rho*sigmax*sigmae)/k);
 	    d[q,1]=d1+0.5*(d2+d3+d4);

		F[q,1]=exp(-k*ttm[1,q]);
 	    F[q,2]=1;
																/* set up seasonal dummy vectors */
		if temp[1,q] = 1  then do; season[q,1] = beta_1;  end;
		if temp[1,q] = 2  then do; season[q,1] = beta_2;  end;
		if temp[1,q] = 3  then do; season[q,1] = beta_3;  end;
		if temp[1,q] = 4  then do; season[q,1] = beta_4;  end;
		if temp[1,q] = 5  then do; season[q,1] = beta_5;  end;
		if temp[1,q] = 6  then do; season[q,1] = beta_6;  end;
		if temp[1,q] = 7  then do; season[q,1] = beta_7;  end;
		if temp[1,q] = 8  then do; season[q,1] = beta_8;  end;
		if temp[1,q] = 9  then do; season[q,1] = beta_9;  end;
		if temp[1,q] = 10 then do; season[q,1] = beta_10; end;
		if temp[1,q] = 11 then do; season[q,1] = beta_11; end;
		if temp[1,q] = 12 then do; season[q,1] = 0;       end;
	end;

	V = diag(z[8:(7+dim)]);

	yt_t_1 = season + d + F*xt_1;
	yt = t(price);
	vt = yt - yt_t_1;
	Qt_t_1 = F*Ct_1*t(F) + V;
	xtt = xt_1 + Ct_1*t(F)*inv(Qt_t_1)*vt;
	Ctt = Ct_1 - Ct_1*t(F)*inv(Qt_t_1)*F*Ct_1;
	xt = G*xtt + c;
	Ctp1_t = G*Ctt*t(G) + W;
	dQt_1 = det(Qt_t_1);

	xt_out[i,] = t(xt);
	dQt_1_out[i,] = dQt_1;
	vttFvtt_out[i,] = t(vt)*inv(Qt_t_1)*vt;

	xt_1 = xt;
	Ct_1 = Ctp1_t;


end;									
																	/* -------------------------------------------- */
																	/* ----------------LIKELIHOOD------------------ */
																	/* -------------------------------------------- */

pi = constant('pi');
log_L = -((total_futures_prices_used)/2)*log(2*pi)-(1/2)*sum(log(dQt_1_out))-0.5*sum(vttFvtt_out);

free /	log_L xt_out price_master ttm_master dates_master other_needed_stuff z mu;

return(log_L);
finish;

								/* ------------------------------------------ */
								/* -----------------Starting----------------- */
								/* ------------------values------------------ */
								/* ------------------------------------------ */
NNNNNalt = other_needed_stuff[2,1];*ncol(price_master) - min(rowmissing); /* shows max number of futures prices used over all days */
k       	=  1.5844599559;
sigmax  	=  0.1672019558;
lambdax 	=  -0.106474372;
mu      	=  0.0465310811;
sigmae  	=  0.1363141466;
rnmu    	=  -0.049098618;
rho     	=  -0.021670367;
s_guess1 	=  0.0054544214; 
s_guess2 	=  0.0002849884;
s_guess3 	=  2.1902114E-6;
s_guess4 	=  0.0000459330;
s_guess5 	=  0.0000531602;
s_guess6 	=  0.0000328522;
s_guess7 	=  7.1961053E-6;
s_guess8 	=  0.0000386839;
s_guess9 	=  0.0001292197;
s_guess10 	=  0.0002550963;
s_guess11 	=  0.000500589 ;
betaguess1 	=  -0.034657102;
betaguess2 	=  -0.004777719;
betaguess3 	=  -0.042752886;
betaguess4 	=  -0.040162925;
betaguess5 	=  -0.037629138;
betaguess6 	=  0.010849036;
betaguess7 	=  0.0104521132;
betaguess8 	=  -0.020267809;
betaguess9 	=  -0.021468978;
betaguess10 =  -0.02226107;
betaguess11 =  -0.023507405;

init_dim = 18;
starting_values = j(init_dim + NNNNNalt,1,0.01);
starting_values[1,1] = k; 
starting_values[2,1] = sigmax; 
starting_values[3,1] = lambdax; 
starting_values[4,1] = mu; 
starting_values[5,1] = sigmae; 
starting_values[6,1] = rnmu; 
starting_values[7,1] = rho; 
starting_values[8,1] = s_guess1; 
starting_values[9,1] = s_guess2; 
starting_values[10,1] = s_guess3; 
starting_values[11,1] = s_guess4; 
starting_values[12,1] = s_guess5; 
starting_values[13,1] = s_guess6; 
starting_values[14,1] = s_guess7; 
starting_values[15,1] = s_guess8; 


/*
fit= liklhd(starting_values);
*/


boundary = 1e300; 
nboundary = -1*1e300;
lb = j((init_dim + NNNNNalt + 2),1,nboundary);
ub = j((init_dim + NNNNNalt + 2),1,boundary);
ub[1,1] = boundary;				lb[1,1] = 0;
ub[2,1] = boundary;				lb[2,1] = 0;
ub[3,1] = boundary;				lb[3,1] = nboundary;
ub[4,1] = boundary;				lb[4,1] = nboundary;
ub[5,1] = boundary;				lb[5,1] = 0;
ub[6,1] = boundary;				lb[6,1] = nboundary;
ub[7,1] = 1;					lb[7,1] = -1;
ub[8:(NNNNNalt+7),1] = boundary;lb[8:(NNNNNalt+7),1] = 0;
bounds = t(lb||ub);

opt = {1 4 . 1};
tc = {500 15000};
par ={. 1E-8 . . . 0.01 . . . . }; 
call nlpqn(rc,psi_opt, "liklhd", starting_values, opt, bounds, tc, par,,,,);

1362 call nlpqn(rc,psi_opt, "liklhd", starting_values, opt, bounds, tc, par,,,,);
WARNING: NLPQN call: The constraint matrix (BLC) has 31 columns.
Only the first 29 columns of these are used.
ERROR: Out of memory for symbols. Cannot proceed.
NOTE: Exiting IML.
NOTE: PROCEDURE IML used (Total process time):
real time 1:24:23.37
cpu time 1:24:23.8

Rick_SAS · Posted 08-23-2019 09:50 AM

It looks to me like the objective function is returning an empty matrix because the computation is

log_L = ...;
free /	log_L;   /* frees the log_L value */
return(log_L);

You are also releasing all of your GLOBAL variables, which presumably you intended to persist between iterations.

Usually, the GLOBAL clause is for the data vectors and hyperparameters. Do not put local variables like Log_L in the GLOBAL clause.

In general, you do not need to FREE matrices in a module because the memory gets freed when the module exits.

As you develop your optimization code, you might want to keep in mind the

"Ten tips before you run an optimization."

RobertThompson · Posted 08-23-2019 10:04 AM

I changed the %macro errors to an IML do loop using call valset which seems to have fixed the memory issue. The code I changed it to is below. I do not understand why this worked. Do you have any idea? Also, I thought the free statement freed all matrices except for the ones listed after the backslash.

do i = 1 to NNNNN;
errorsqname = "s" + strip(char(i,4));
call valset(errorsqname,z[7+i]);
end;

Rick_SAS · Posted 08-23-2019 10:21 AM

Sorry. You are correct. I did not notice the slash mark, so I thought you were freeing important data.

Anyway, the FREE statement is not needed at all.

What is the value of NNNNN? I suppose if it is large enough then the number of symbols could be exhausted. But it looks like NNNNN=11, which shouldn't cause any problems. So I don't know why you were previously getting that error.

In general, matrix-vector languages like IML are more efficient when you define a single symbol (S_ that has NNNNN elements, rather than many symbols S1, S2, S3, ...., SNNNNN. Same with the beta_1 thru beta_11 variables. Using vectors enables you to use high-level operations (called BLAS operations) to perform calculations, rather than iterating over every element. If your program takes a long time to run, introducing vectorized operations can improve performance.

Tip: You don't need to set

boundary = 1e300;
nboundary = -1*1e300;

in the contraint matrix. Use missing values when a variable is unconstrainted.

RobertThompson · Posted 08-23-2019 10:29 AM

I was using the free statement in a last ditch attempt to conserve memory. Since I have removed that macro I am having no memory issues. The value of NNNNN is 11 for this dataset. Thanks for the tips, I will try to vectorize those variables. I am still confused as to why the removing macro fixed the memory usage problem. For this program I guess the problem is fixed, but if you have any ideas I am interested in finding out the root cause.

Ksharp · Posted 08-24-2019 08:26 AM

Or You might try SAS/OR .Post your optimize problem in OR forum . and @RobPratt is there .

Or You could try Genetic Algorithm in IML ?

Call NLPQN to maximize likelihood function running out of memory

Re: Call NLPQN to maximize likelihood function running out of memory

Re: Call NLPQN to maximize likelihood function running out of memory

Re: Call NLPQN to maximize likelihood function running out of memory

Re: Call NLPQN to maximize likelihood function running out of memory

Re: Call NLPQN to maximize likelihood function running out of memory

Call NLPQN to maximize likelihood function running out of memory

Re: Call NLPQN to maximize likelihood function running out of memory

Re: Call NLPQN to maximize likelihood function running out of memory

Re: Call NLPQN to maximize likelihood function running out of memory

Re: Call NLPQN to maximize likelihood function running out of memory

Re: Call NLPQN to maximize likelihood function running out of memory

Registration is open