Dimos
Calcite | Level 5

Dear all,

I am conducting a bootstrap exercise for my research, so I have to run a regression for each replication. Due to model restrictions, I have to constrain the coefficient of one of the two explanatory variables to the value of another simulated variable. I couldn't do that with PROC REG, since the restriction must involve a variable that appears in the preceding MODEL statement. I am trying to do it in IML inside a loop instead. The problem is how to impose the restriction in the beta estimation. Below is my macro code. I want to set the coefficient estimate of R_BETA equal to the value of Ret_var for each replication.

%macro reg_loop;

%do j=1 %to 2500;

DATA Cross;

set Mcb.Cross;

if replicate=&j;

Data Variance;

set Mcb.Variance;

if replicate=&j;

run;

PROC IML;

use Cross;

read all var {CF_BETA R_BETA} into X;

read all var {eret_mean} into Y;

use Variance;

read all var {Ret_var} into Restr;

/*************************************************************************************

*   I  believe I need a small ``trick'' here.                                                 *

*************************************************************************************/

n=nrow(X);

k=ncol(X);

X=J(n,1,1)||X;

C=inv(X`*X);

B_Hat=C*X`*Y;

/************************************************************************************/

SSE=y`*y-B_Hat`*X`*Y;

DFE=n-k-1;

MSE=SSE/DFE;

Mean_Y=Sum(Y)/n;

SSR=B_Hat`*X`*Y-n*Mean_Y**2;

MSR=SSR/k;

SST=SSR+SSE;

F=MSR/MSE;

SE=SQRT(vecdiag(C)#MSE);

T=B_Hat/SE;

PROBT=2*(1-CDF('T', ABS(T), DFE));

B_Hat = B_Hat`;

SE = SE`;

T = T`;

PROBT = PROBT`;

ANOVA_Table=(k||SSR||MSR||F||DFE||SSE||MSE);

STATS_Table=B_Hat||SE||T||PROBT;

create x from ANOVA_Table;

append from ANOVA_Table;

create y from STATS_Table;

append from STATS_Table;

quit;

data x;

set x;

replicate=&j;

run;

data y;

set y;

replicate=&j;

run;

proc datasets library=work;

append base=work.x1 data=work.x force;

proc datasets library=work;

append base=work.y1 data=work.y force;

quit;

%end;

%mend reg_loop;

%reg_loop;

Do you have any suggestions? Any idea will be much appreciated!

Thank you for your time and help.

Dimos

1 ACCEPTED SOLUTION

Accepted Solutions
Rick_SAS
SAS Super FREQ

I have many suggestions:

1) Get rid of the macro loop. See the article Simulation in SAS for a quick overview.  The IML language has a DO loop, so you can perform all of the looping inside SAS/IML.

2) Bootstrapping in SAS/IML is covered in a chapter of the book Simulating Data with SAS. The book describes dozens of ways to make your program run faster.  If you are planning to run other simulations, this book will save you time.

3) You don't specify your version of SAS/IML, but the primary trick to bootstrapping is sampling with replacement from the data. Therefore, see how to use the SAMPLE function in SAS/IML by reading the article Sampling with replacement: Now easier than ever in the SAS/IML language - The DO Loop

All those tips will make your program more efficient.
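
As a rough illustration of tip 1, the macro loop could be replaced by a DO loop inside a single PROC IML call. This is only a sketch: the data set Mcb.Cross and its variables come from the question, while the output data set AllEst and its column names are made up for the example. It fits the unrestricted model for each replicate, so the restriction still has to be added.

```sas
proc iml;
/* Mcb.Cross and its variable names are taken from the question */
use Mcb.Cross;
read all var {replicate CF_BETA R_BETA eret_mean};
close Mcb.Cross;

ids = unique(replicate);              /* distinct replicate IDs, as a sorted row vector */
est = j(ncol(ids), 4, .);             /* columns: replicate, intercept, two slopes */
do i = 1 to ncol(ids);
   idx = loc(replicate = ids[i]);     /* row numbers that belong to this replicate */
   X = j(ncol(idx), 1, 1) || CF_BETA[idx, ] || R_BETA[idx, ];
   y = eret_mean[idx, ];
   b = solve(X`*X, X`*y);             /* OLS estimates from the normal equations */
   est[i, ] = ids[i] || b`;
end;

/* AllEst and its column names are hypothetical */
create AllEst from est[colname={"replicate" "Intercept" "b_CF_BETA" "b_R_BETA"}];
append from est;
close AllEst;
quit;
```

Reading the data once and writing one output data set avoids the 2,500 DATA steps and PROC DATASETS calls that the macro version runs.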

With regard to your question of running a linear regression with a restricted parameter, I am not convinced that you can't use PROC REG and the RESTRICT statement.  Isn't the parameter value the same for all of the simulated samples?

At any rate, if you decide to solve the regression problem in SAS/IML, here's what I'd try.

The model is Y = X*beta + eps

If you want to restrict the j_th element of beta to some known value c, then define U to be the X matrix without the j_th column and let gamma be beta without the j_th row.  Rewrite the model as

Y = U*gamma + X[,j]*c + eps

or

Y - X[,j]*c = U*gamma + eps.

In other words, define Z = Y - X[,j]*c  and then solve the OLS model Z = U*gamma + eps to estimate the p-1 unrestricted parameters.
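
Applied to the variable names in the question, that idea might look roughly like the following sketch. It assumes that Cross and Variance hold a single replicate and that the first value of Ret_var is the restriction for that replicate. Those assumptions are mine, so check them against your data.

```sas
proc iml;
use Cross;
read all var {CF_BETA R_BETA} into X;
read all var {eret_mean} into Y;
close Cross;

use Variance;
read all var {Ret_var} into Restr;
close Variance;

c = Restr[1];                     /* assumption: one restriction value per replicate */
n = nrow(X);

Z = Y - X[,2]*c;                  /* move the restricted R_BETA term to the left side */
U = j(n,1,1) || X[,1];            /* intercept and CF_BETA are the free regressors */
gamma = solve(U`*U, U`*Z);        /* OLS for the unrestricted parameters */

resid = Z - U*gamma;
DFE = n - ncol(U);                /* only two parameters are estimated */
MSE = ssq(resid) / DFE;
SE  = sqrt(vecdiag(inv(U`*U)) * MSE);

B_Hat = gamma // c;               /* intercept, CF_BETA, restricted R_BETA */
print B_Hat, SE;
quit;
```

Note that the error degrees of freedom use the number of free parameters, not the number of columns in the original X matrix, and that the restricted coefficient has no standard error.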


3 REPLIES 3
Dimos
Calcite | Level 5

Dear Rick,

Thanks a lot,

I have done the loop you proposed in another step. I just included the macro here to give a more general idea. I am bootstrapping the residuals of a VAR model and then regenerating the state variables for each replication like this: Y = a*Y[t-1] + simulated_errors. Anyway, I had no problem with the simulation itself.  I have the book you propose as well as Statistical Programming with SAS/IML Software, and they helped me in this respect.

I am using version 9.22 of SAS. The restriction changes in each simulation; that's the problem with the RESTRICT statement. The RESTRICT statement accepts a variable that is in the MODEL statement. I haven't found a way to make it take just the value of this variable in each replication, which is why I am trying IML. I think I got what you proposed above. I will try it and update the post accordingly.

Thanks

Dimos

Rick_SAS
SAS Super FREQ

Sounds good.  Try my formula on a simple example and compare it with the PROC REG results to see if it works. Best wishes.
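
One possible check, sketched on the Sashelp.Class sample data: the choice of model and the restriction value 3 are arbitrary and only serve the comparison.

```sas
/* Reference fit: PROC REG with the Height coefficient restricted to 3 */
proc reg data=sashelp.class;
   model Weight = Age Height;
   restrict Height = 3;
run;
quit;

/* Same restriction imposed by hand in SAS/IML */
proc iml;
use sashelp.class;
read all var {Age Height} into X;
read all var {Weight} into Y;
close sashelp.class;

c = 3;                              /* restricted value for the Height coefficient */
Z = Y - X[,2]*c;                    /* subtract the restricted term from the response */
U = j(nrow(X),1,1) || X[,1];        /* intercept and Age remain free */
gamma = solve(U`*U, U`*Z);
print gamma[rowname={"Intercept" "Age"}];
quit;
```

The Intercept and Age estimates printed by IML should agree with the corresponding rows of the PROC REG parameter estimates table.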
