By Group Processing inside a SAS Macro involving Proc Optmodel


Dear All,

 

We are seeking your help to fix a bug in our SAS code, which uses a SAS macro that contains a proc optmodel step.  A short description of the problem follows.

 

We have a SAS macro which includes a do loop and some proc optmodel statements.  The do loop is required for by-group processing. Our dataset is divided into 10 groups, indexed by the letter ‘q’.  The letter is referred to in two places inside the do loop:

 

  1. To read the dataset for the sub-group (dec&q, q = 1 to 10).
  2. To define initial values for each of 16 parameters that vary by sub-group. For example, the parameter sigma_T1 has the initial value sigma_T[q], and c1a has the initial value c1[q].  These initial values come from a first-stage model estimation done using proc optmodel and are read in as a dataset.

The error in the log file relates to item 2; we get the following message when the initial values are defined.

ERROR 525-782: The symbol 'q' is unknown.

The error message is in Line 188 of the Log file (attached).

 

Can the group help us address this error message?  Should we index the initial values differently?

A more detailed description of the program and the problem follows, and the SAS code comes after it. The LOG file and the input data set are attached.

 

 

Thanks a lot.

 

Srinivasan and Murgie

 

Detailed Statement of Problem

Preliminaries:

 

We are trying to estimate the parameters of a non-linear model that has one dependent variable, 15 predictor variables, and 17 parameters.  Of the 17 parameters, we can assume one to be equal to a sample-dependent constant. Thus, we seek to estimate 16 parameters.   

 

The model is as follows

w_ret2[i] =

(1 - rho*(sigma_T/s))*s_uepp_s[i] - ((sigma_T*sqrt(1 - (rho**2)))/sigma_Z)*s_tq2[i] - c1*w_mret2[i] - c2*w_lmcap[i] - c3*w_bm[i] -c4*w_agro[i] - c5*w_op_prof[i] - c6*sntype[i] - c7*w_lagstd[i] - c8*w_lag3vol_s[i] - c9*w_laguepp[i] - c10*w_lnumage[i] - c11*w_divy[i] - c12*w_lqprc[i] - c13*w_mom[i];

 

The index [i] refers to the row number of the dataset.

Our sample data set (attached) has 1137 rows and 16 columns.

 

The dependent variable is w_ret2.

 

The 15 predictor variables are s_uepp_s; s_tq2; w_mret2; w_lmcap; w_bm; w_agro; w_op_prof; sntype; w_lagstd; w_lag3vol_s; w_laguepp; w_lnumage; w_divy; w_lqprc;  w_mom

 

The 17 parameters are sigma_T, s, sigma_Z, rho, and c1-c13. Of these, ‘s’ depends on the sample and is not estimated.

 

Outcomes and Problem Definition.

 

We are able to estimate the model for the entire sample of 1137 observations.

 

Our next step, for which we seek help, is to estimate the model for sub-samples.  Thus, we want to implement by-group processing in proc optmodel.

 

The structure of our SAS program is as follows.

 

  1. We read the input file, fii_samp.dat, into a SAS dataset, fiireg.
  2. We use proc rank to create a rank variable based on one of the predictor variables, w_lmcap, that splits the rows into 10 groups (proc rank numbers them 0-9).
  3. We output the ranked data into 10 datasets (labeled dec1, dec2, … dec10) based on the value of the rank variable.
  4. We write a SAS macro, %optim, that runs proc optmodel for each of the 10 datasets. The macro has a do loop, indexed by the letter ‘k’, that reads data from each dataset, dec&k, and solves the model for that dataset. Importantly, in proc optmodel, we initialize the parameters with starting values that are user-defined and common to all 10 datasets.  We refer to this as ‘STAGE 1 ESTIMATION’.
  5. For each dataset, dec&k, we output the stage 1 estimates of the 16 parameters into an output dataset called ‘optdata&k’. Stacked together, these give a dataset, optdata, with a row for each sub-sample (dec1 – dec10) and 16 columns for the parameters.
  6. We run the macro %optim.
  7. We write a second macro, %optim2, that again reads the 10 datasets as well as the STAGE 1 estimates (16 parameters × 10 sub-samples). In this second macro, the do loop is indexed by ‘q’.

For convenience, in the attached code, we set the do loop to read and process only 2 sub-samples.

 

Now comes the problem. We want to define the initial values of the parameters to equal the parameter estimates from Stage 1. 

 

We do so by equating each parameter as follows

Sigma_T1 = sigma_T[q]

Sigma_Z1 = sigma_Z[q]

Etc.

 

The error in the log file relates to the above definition of initial values; we get the following message when the initial values are defined.

 

ERROR 525-782: The symbol 'q' is unknown.

 

The error message is in Line 188 of the Log file.

 

Kindly help us define initial values of the parameters in such a way that the error does not occur.

 

Thanks,

Srinivasan and Murgie

 

 

 

options ls = 132 ps = 2000 nodate nocenter;

data fiireg;
infile 'fii_samp.dat';
input ccode $10. +1 w_ret2 15.3 +1 s_uepp_s 15.3 +1 s_tq2 15.3 +1 
w_mret2 15.3 +1 w_lmcap 15.3 +1 w_bm 15.3 +1 w_agro 15.3 +1 w_op_prof 15.3 +1
sntype 15.3 +1 w_lagstd 15.3 +1 w_lag3vol_s 15.3 +1 w_laguepp 15.3 +1
w_lnumage 15.3 +1 w_divy 15.3 +1 w_lqprc 15.3 +1 w_mom 15.3;

proc rank data = fiireg out = fiirank ties = mean groups = 10;
var w_lmcap;
ranks r_lmcap;

data dec1 dec2 dec3 dec4 dec5 dec6 dec7 dec8 dec9 dec10;
set fiirank;

if r_lmcap = 0 then output dec1;
if r_lmcap = 1 then output dec2;
if r_lmcap = 2 then output dec3;
if r_lmcap = 3 then output dec4;
if r_lmcap = 4 then output dec5;
if r_lmcap = 5 then output dec6;
if r_lmcap = 6 then output dec7;
if r_lmcap = 7 then output dec8;
if r_lmcap = 8 then output dec9;
if r_lmcap = 9 then output dec10;

%macro optim;

%do k = 1 %to 2;

proc univariate noprint data = dec&k;
var s_uepp_s;
output out = desc&k std = s&k;

proc print data = desc&k;
var s&k;

proc optmodel;
set alldata;
num s&k;
num w_ret2{alldata};
num s_uepp_s{alldata};
num s_tq2{alldata};
num w_mret2{alldata};
num w_lmcap{alldata};
num w_bm{alldata};
num w_agro{alldata};
num sntype{alldata};
num w_op_prof{alldata};
num w_lagstd{alldata};
num w_lag3vol_s{alldata};
num w_laguepp{alldata};
num w_lnumage{alldata};
num w_divy{alldata};
num w_lqprc{alldata};
num w_mom{alldata};
num mycov{i in 1.._nvar_,  j in 1..i};

read data desc&k into s&k;

read data dec&k into alldata = [_n_] w_ret2 s_uepp_s s_tq2 w_mret2 w_lmcap w_bm w_agro w_op_prof sntype w_lagstd w_lag3vol_s w_laguepp w_lnumage w_divy w_lqprc w_mom;

var sigma_T init 0.03, sigma_Z init 0.03, rho init 0.4,  c1 init 0.86, c2 init -0.002, c3 init 0.0009, c4 init 0.001, c5 init -0.0005, c6 init 0.0006, c7 init 0.00005, c8 init -0.002, c9 init 0.003, c10 init -0.001, c11 init 0.0009, c12 init 0.0014, c13 init 0.0128;

impvar Err{i in alldata} = w_ret2[i] - (1 - rho*(sigma_T/s&k))*s_uepp_s[i] - ((sigma_T*sqrt(1 - (rho**2)))/sigma_Z)*s_tq2[i] - c1*w_mret2[i] - c2*w_lmcap[i] - c3*w_bm[i] -c4*w_agro[i] - c5*w_op_prof[i] - c6*sntype[i] - c7*w_lagstd[i] - c8*w_lag3vol_s[i] - c9*w_laguepp[i] - c10*w_lnumage[i] - c11*w_divy[i] - c12*w_lqprc[i] - c13*w_mom[i];

min ssq = sum{i in alldata} Err[i]^2;

performance nthreads = 1;

con a: rho >= -1 + 1e-6;
con b: sigma_T >= 1e-6;
con c: sigma_Z >= 1e-6;
con d: rho <= (1 - 1e-6);
con e: sigma_Z <= 0.01;
con f: sigma_T <= 2;

solve with nlp / tech = interiorpoint multistart msbndrange = 3 msnumstarts = 20000 covest = (cov = 5 covout = mycov) 
seed = 116894 msprintlevel = 3;

print sigma_T.msinit sigma_Z.msinit rho.msinit c1.msinit c2.msinit c3.msinit c4.msinit c5.msinit c6.msinit 
c7.msinit c8.msinit c9.msinit c10.msinit c11.msinit c12.msinit c13.msinit;

print mycov;

print sigma_T sigma_Z rho c1 c2 c3 c4 c5 c6 c7 c8 c9 c10 c11 c12 c13;

print s&k;

create data optdata&k from sigma_T sigma_Z rho c1 c2 c3 c4 c5 c6 c7 c8 c9 c10 c11 c12 c13;

%end;

%mend optim;

%optim;

data optdata;
set optdata1 optdata2;

proc print data = optdata;

%macro optim2;

%do q = 1 %to 2;

proc optmodel;

set alldata;
set parmdata;
num s&q;
num w_ret2{alldata};
num s_uepp_s{alldata};
num s_tq2{alldata};
num w_mret2{alldata};
num w_lmcap{alldata};
num w_bm{alldata};
num w_agro{alldata};
num sntype{alldata};
num w_op_prof{alldata};
num w_lagstd{alldata};
num w_lag3vol_s{alldata};
num w_laguepp{alldata};
num w_lnumage{alldata};
num w_divy{alldata};
num w_lqprc{alldata};
num w_mom{alldata};

num sigma_T{parmdata};
num sigma_Z{parmdata};
num rho{parmdata};
num c1{parmdata};
num c2{parmdata};
num c3{parmdata};
num c4{parmdata};
num c5{parmdata};
num c6{parmdata};
num c7{parmdata};
num c8{parmdata};
num c9{parmdata};
num c10{parmdata};
num c11{parmdata};
num c12{parmdata};
num c13{parmdata};
num chk;

num mycov{i in 1.._nvar_,  j in 1..i};

read data optdata into parmdata = [_n_] sigma_T sigma_Z rho c1 c2 c3 c4 c5 c6 c7 c8 c9 c10 c11 c12 c13;

read data desc&q into s&q;

read data dec&q into alldata = [_n_] w_ret2 s_uepp_s s_tq2 w_mret2 w_lmcap w_bm w_agro w_op_prof sntype w_lagstd w_lag3vol_s w_laguepp w_lnumage w_divy w_lqprc w_mom;

var sigma_T1 init sigma_T[q], sigma_Z1 init sigma_Z[q], rho1 init rho[q],  c1a init c1[q], c2a init c2[q], c3a init c3[q], c4a init c4[q], c5a init c5[q], c6a init c6[q], c7a init c7[q], c8a init c8[q], c9a init c9[q], c10a init c10[q], c11a init c11[q], c12a init c12[q], c13a init c13[q];

impvar Err{i in alldata} = w_ret2[i] - (1 - rho1*(sigma_T1/s&q))*s_uepp_s[i] - ((sigma_T1*sqrt(1 - (rho1**2)))/sigma_Z1)*s_tq2[i] - c1a*w_mret2[i] - c2a*w_lmcap[i] - c3a*w_bm[i] - c4a*w_agro[i] - c5a*w_op_prof[i] - c6a*sntype[i] - c7a*w_lagstd[i] - c8a*w_lag3vol_s[i] - c9a*w_laguepp[i] - c10a*w_lnumage[i] - c11a*w_divy[i] - c12a*w_lqprc[i] - c13a*w_mom[i];

min ssq = sum{i in alldata} Err[i]^2;

performance nthreads = 1;

con a: rho1 >= -1 + 1e-6;
con b: sigma_T1 >= 1e-6;
con c: sigma_Z1 >= 1e-6;
con d: rho1 <= (1 - 1e-6);
con e: sigma_Z1 <= 0.01;
con f: sigma_T1 <= 2;

solve with nlp / tech = interiorpoint multistart msbndrange = 3 msnumstarts = 20000 covest = (cov = 5 covout = mycov) 
seed = 116894 msprintlevel = 3;

chk = sigma_T1 - sigma_T[q];

print chk;

print mycov;

print sigma_T1 sigma_Z1 rho1 c1a c2a c3a c4a c5a c6a c7a c8a c9a c10a c11a c12a c13a;

print s&q;

%end;

%mend optim2;

%optim2;

 

 

 

 


Accepted Solutions
03-22-2018 10:50 PM

Re: By Group Processing inside a SAS Macro involving Proc Optmodel

Posted in reply to srinirangan123

Disclaimer: I'm no expert on optmodel at all.

In

var sigma_T1 init sigma_T[q], sigma_Z1 init sigma_Z[q], rho1 init rho[q],  c1a init c1[q], c2a init c2[q], c3a init c3[q], c4a init c4[q], c5a init c5[q], c6a init c6[q], c7a init c7[q], c8a init c8[q], c9a init c9[q], c10a init c10[q], c11a init c11[q], c12a init c12[q], c13a init c13[q]

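shouldn't that be &q everywhere, i.e. sigma_T[&q], sigma_Z[&q], and so on? q is the macro loop variable, so it has to be written as &q for its value to be substituted into the optmodel code.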

Also here:

chk = sigma_T1 - sigma_T[q];
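
A minimal sketch of that change, cut down to two parameters so it runs on its own. The dataset values and the macro name optim2_sketch are made up for illustration; only the [&q] indexing is the point. The macro processor replaces &q with 1 or 2 before proc optmodel parses the statement, whereas a bare q is passed through as an undeclared optmodel symbol, which is what the log message complains about.

```sas
/* Hypothetical stage 1 estimates: one row per sub-sample (values are made up) */
data optdata;
   input sigma_T rho;
   datalines;
0.03 0.40
0.05 0.25
;
run;

%macro optim2_sketch;
   %do q = 1 %to 2;
      proc optmodel;
         set parmdata;
         num sigma_T{parmdata};
         num rho{parmdata};

         /* Row number of optdata becomes the index of each parameter array */
         read data optdata into parmdata = [_n_] sigma_T rho;

         /* &q resolves to 1 or 2 before optmodel parses this statement */
         var sigma_T1 init sigma_T[&q], rho1 init rho[&q];

         /* No solve here: this prints the starting values only */
         print sigma_T1 rho1;
      quit;
   %end;
%mend optim2_sketch;

%optim2_sketch;
```

The same substitution would apply to every [q] in the var statement and to the chk assignment after the solve.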


All Replies

Re: By Group Processing inside a SAS Macro involving Proc Optmodel

Posted in reply to KurtBremser

Thanks a lot, Kurt.

 

I did try that, and it didn't work.  

 

But here sigma_T1 is a parameter which is fixed inside the loop and the macro.  Its initial value is sigma_T which changes as q varies.

 

Thanks again!
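
One way to check what the suggestion changes is to look at the text the macro generates on each pass of the loop. The sketch below is illustrative only (the macro name show_q is hypothetical): it writes both forms of the statement to the log. The variable name sigma_T1 stays the same on every pass; only the index after the ampersand is replaced by the loop value.

```sas
/* Hypothetical helper: writes the text that the macro loop would generate */
%macro show_q;
   %do q = 1 %to 2;
      /* [q] is passed through unchanged; [&q] is replaced by the loop value */
      %put without ampersand: var sigma_T1 init sigma_T[q];
      %put with ampersand: var sigma_T1 init sigma_T[&q];
   %end;
%mend show_q;

%show_q;
```

Running the real macro with options mprint; in effect would show the generated optmodel statements in the log in the same way.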


Re: By Group Processing inside a SAS Macro involving Proc Optmodel

Posted in reply to KurtBremser

Dear Kurt,

 

I am sorry, I did not read your email correctly.

You were right. Changing everything to &q did make it work.

 

Thanks!

Srini

☑ This topic is solved.
