

Posted 08-04-2021 11:39 AM

Hello

I am trying to create a bootstrap confidence interval so that I can estimate its coverage and its length.

First I create my data as usual and estimate the parameters (using CALL NLPNRA); then I use these MLEs to simulate new data, and so on. I am having trouble in two areas. The first is the output: the results are horrible and contradict the whole idea of bootstrapping. And although I do get results, the log file is full of errors.

My last question: I plan to get the MLEs for the two parameters, save each as a vector, then find the mean and variance of each and use them to create the CI.

Is there a shorter way, or is this the only way to do it?
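Here is a sketch of what I mean; the vector of bootstrap MLEs below uses toy values, and the interval is a simple normal approximation built from the bootstrap mean and standard deviation:

```
proc iml;
/* Hypothetical sketch: B_alpha holds the bootstrap MLEs (toy values here) */
B_alpha = {1.8, 2.1, 2.4, 1.9, 2.2, 2.0, 2.3, 1.7, 2.1, 2.0};
mean_a = B_alpha[:];                        /* bootstrap mean */
sd_a   = std(B_alpha);                      /* bootstrap standard deviation */
z      = quantile("Normal", 1 - 0.05/2);    /* 97.5th percentile of N(0,1) */
CI     = (mean_a - z*sd_a) || (mean_a + z*sd_a);  /* 95% normal-approximation CI */
print CI[colname={"Lower" "Upper"}];
quit;
```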

Thank you


9 REPLIES

---

Thank you so much. You are right. It seems that I didn't pay attention to my tab key. 😊

---

First, do you want to do a bootstrap computation or run a simulation study from a model?

In a bootstrap computation, you resample from the data and refit the model on each resample. Each sample is a random draw, with replacement, from the data. You estimate the parameters for each resample.

In a simulation study, you fit the model to the data. This gives you ONE set of parameters. You then draw random samples from the model and estimate the parameters for each random sample. This is sometimes called the "parametric bootstrap," which is why it is important to understand what you are trying to accomplish.
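As a minimal sketch of the difference in SAS/IML (the data and the normal model below are stand-in assumptions, not this thread's model; the SAMPLE function draws with replacement by default):

```
proc iml;
call randseed(123);
x = {3.1, 2.7, 4.0, 3.5, 2.9};        /* observed data (toy values) */
n = nrow(x);

/* Bootstrap computation: resample the DATA with replacement */
xb = sample(x, n);                     /* one bootstrap resample; refit the model on xb */

/* Simulation study (parametric bootstrap): fit once, then draw from the MODEL */
mu_hat = x[:];  sigma_hat = std(x);    /* the ONE set of fitted parameters */
xs = j(n, 1, .);
call randgen(xs, "Normal", mu_hat, sigma_hat);  /* one sample from the fitted model */
print (xb`)[label="resampled from data"], (xs`)[label="simulated from model"];
quit;
```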

Your program seems to be the parametric bootstrap (simulation), although the initial data are simulated. Perhaps this is just so that we can run your example?

Regarding the "horrible" results, your program looks reasonable to me, except that you are incorrectly comparing the Monte Carlo means with the original (alpha, lambda) values. These are not the values that you are bootstrapping. You are bootstrapping the parameter values (alpha_mle, lambda_mle)=(3.955, 2.658), which are the parameter estimates for the simulated data. If you change B=500 and run the simulation, you will see that these values are near the center of the bootstrap distribution of estimates:**call scatter(B_alpha_MLE, B_lambda_MLE) other="refline 3.955/axis=x; refline 2.658/axis=y;";**

---

You are absolutely right in your assumptions. This is exactly what I need to do.

However, I have a problem: when I increase the number of simulations to even N=10 with B=100, it takes forever, and I had to force the run to stop even though I have a very fast machine. I am facing this even with one estimation method for one parameter, and I still have four more estimation methods for two parameters, all of which involve solving systems of equations.

---

**Accepted Solution**

It looks to me like you are repeating the simulation N times, which is called a double-bootstrap. Each iteration of the loop

```
Do J=1 to N;
   ...
End;
```

is a simulation. The bias is the difference between the Monte Carlo mean and the parameter for the simulation. Again, the parameter for your simulation is (alpha_mle, lambda_mle), NOT (alpha, lambda).
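In terms of this program's own variables, that means the bias is computed relative to alpha_mle, not alpha (a one-line sketch):

```
B_alpha1_Bias[j] = abs( B_alpha1[:] - alpha_mle );  /* Monte Carlo mean minus the simulation parameter */
```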

The double-bootstrap is notoriously expensive. I guess you are using it to estimate the variance of the bias?

Nevertheless, I can run your program with `N=100; B=100;` in about 11 seconds on a PC from 2015, so I don't understand why it takes a long time for you. There are a few improvements you can make, most notably decreasing the number of iterations for each call to NLPNRA by using a better initial guess. I've made one or two other minor improvements; the attached program runs in 11 seconds. Let me know how long this takes on your "very fast machine".

```
%macro ODSOff();
ods graphics off;
ods exclude all;
ods noresults;
options NONOTES;
%mend;
%macro ODSOn();
ods graphics on;
ods exclude none;
ods results;
options NOTES;
%mend;
%odsoff;
proc iml;
t0 = time();
seed=12345; N=100; B=100;
m=20;
alpha =2; Lambda=2; a=0.05; *a represents the significance level;
** Data simulation using progressive first failure **;
*****************************************************;
start Data(alpha,lambda) global(seed,m);
U=ranuni(repeat(seed,m,1));
X=( 1-(1-U)##(1/alpha) )##(1/lambda);
return(X);
finish;
*******************************************************;
start MLE_func(y) global(X);
m=nrow(x);
func=0;
alpha=y[1];
lambda=y[2];
Sum_log=J(m,1,0);
Sum_log=log(x);
Sum_log_1=J(m,1,0);
Sum_log_1=log(1-X##lambda);
func=func + m*log(alpha*lambda)+(lambda-1)* Sum_log[+] + (alpha-1) * Sum_log_1[+] ;
Return(func);
finish;
********************************************************;
BLimit_L_alpha1=J(N,1,0); BLimit_U_alpha1=J(N,1,0);
B_Length_alpha1=J(N,1,0); B_COV_alpha1=J(N,1,0);
B_alpha1_Bias=J(N,1,0);
Do J=1 to N;
con = {1e-6 1e-6, . .};
x0_MLE= {0.05, 0.05};
optn = {2 0};
tc={10000 14000};
********************************************;
*** Bootstrap Steps ***;
********************************************;
Step1:
X=Data(alpha,lambda);
call nlpnra(rc, MLE_ret, "MLE_func", x0_MLE, optn, con,tc);
alpha_mle =MLE_ret[1];
lambda_mle =MLE_ret[2];
*****************************************************************;
Step2:
B_alpha1 = J(B,1,0);
B_lambda1 = J(B,1,0);
x0_MLE = alpha_mle || lambda_mle;
Do i=1 to B;
X=Data(alpha_mle,lambda_mle);
call nlpnra(rc, MLE_ret, "MLE_func", x0_MLE, optn, con,tc);
B_alpha1[i]=MLE_ret[1];
B_lambda1[i]=MLE_ret[2];
end;
****** Bootstrap Bias *****;
*************************************************;
B_alpha1_Bias[j]= ABS( B_alpha1[:]- alpha_mle ); /* bias relative to the simulation parameter alpha_mle, not alpha */
*****************************************************;
L=ceil(B*a/2); U=ceil(B*(1-a/2)); /*round up */
Call sort(B_alpha1, 1); /** sort B_alpha_mle by 1st col **/
BLimit_L_alpha1[j]=( B_alpha1[L]+B_alpha1[L+1] )/2;
BLimit_U_alpha1[j]=( B_alpha1[U]+B_alpha1[U+1] )/2;
B_Length_alpha1[J]=BLimit_U_alpha1[j] - BLimit_L_alpha1[j]; ** length of the C.I. for alpha;
If ( (alpha >= BLimit_L_alpha1[j])& (alpha <= BLimit_U_alpha1[j] ) ) then
B_COV_alpha1[J]=1;* COV is the coverage;
end;
%odson
***************** END Of Simulations ************************;
********************************************************************************************************;
Biasalpha1=B_alpha1_Bias[:];
B_length_alpha1=B_Length_alpha1[:];
B_coverage_alpha1=B_COV_alpha1[:];
print Biasalpha1 B_length_alpha1 B_coverage_alpha1;
totalTime = time() - t0;
print totalTime;
Quit;
```

---

Thank you so much Rick,

Regarding what you said ("Again, the parameter for your simulation is (alpha_mle, lambda_mle), NOT (alpha, lambda)."): you are absolutely right, and it makes complete sense. I was focusing on the bootstrap and the many steps I need to add, and it slipped my mind.

I ran the code on my machine and it took 5 seconds, which is amazing. Thank you so much.

The initial point was another thing that I had totally ignored, so I am grateful to you for pointing it out.

Thanks again

---

I followed what you said in your reply, which makes sense.

However, when I ran my code for a small sample (20 observations) and for only B=100, the bias came out to be zero. Am I doing it wrong? Please advise.

Thank you

```
%macro ODSOff();
ods graphics off;
ods exclude all;
ods noresults;
options NONOTES;
%mend;
%macro ODSOn();
ods graphics on;
ods exclude none;
ods results;
options NOTES;
%mend;
%odsoff;
proc iml;
t0 = time();
seed=12345; N=1000; B=100;
m=20;
alpha =1; Lambda=2; a=0.05; *a represents the significance level;
** Data simulation using progressive first failure **;
*****************************************************;
start Data(alpha,lambda) global(seed,m);
U=ranuni(repeat(seed,m,1));
X=( 1-(1-U)##(1/alpha) )##(1/lambda);
return(X);
finish;
*******************************************************;
start MLE_func(y) global(X);
m=nrow(x);
func=0;
alpha=y[1];
lambda=y[2];
Sum_log=log(x);
Sum_log_1=log(1-X##lambda);
func=func + m*log(alpha*lambda)+(lambda-1)* Sum_log[+] + (alpha-1) * Sum_log_1[+] ;
Return(func);
finish;
*** Solve for MOM: by solving a system of equations ***;
*******************************************************;
start MOM_Fun(var) global (X,X1,X2);
X_bar=X[:];
Xsqr_bar=(X##2)[:];
alpha = var[1]; lambda = var[2];
f = j(1, 2, .);
f[1] = log(X_bar) -
       ( lgamma(alpha+1) + lgamma(1/lambda+1) - lgamma(alpha+1/lambda+1));
f[2] = log(Xsqr_bar) -
       (log(alpha) + lgamma(alpha) + lgamma(2/lambda+1) - lgamma(alpha+2/lambda+1));
return (f);
finish;
********************************************************;
** alpha1= alpha_mle; *alpha2= alpha_mom;
BLimit_L_alpha1=J(N,1,0); BLimit_U_alpha1=J(N,1,0);
B_Length_alpha1=J(N,1,0); B_COV_alpha1=J(N,1,0);
B_alpha1_Bias=J(N,1,0);
BLimit_L_Lambda1=J(N,1,0); BLimit_U_Lambda1=J(N,1,0);
B_Length_Lambda1=J(N,1,0); B_COV_Lambda1=J(N,1,0);
B_Lambda1_Bias=J(N,1,0);
**********************************************************;
**alpha2= alpha_mom;
BLimit_L_alpha2=J(N,1,0); BLimit_U_alpha2=J(N,1,0);
B_Length_alpha2=J(N,1,0); B_COV_alpha2=J(N,1,0);
B_alpha2_Bias=J(N,1,0);
BLimit_L_Lambda2=J(N,1,0); BLimit_U_Lambda2=J(N,1,0);
B_Length_Lambda2=J(N,1,0); B_COV_Lambda2=J(N,1,0);
B_Lambda2_Bias=J(N,1,0);
*********************************************************;
Do J=1 to N;
con = {1e-6 1e-6, . .};
x0_MLE= {0.05, 0.05};
x0_mom = {0.5 0.8};
optn = {2 0};
tc={10000 14000};
********************************************;
*** Bootstrap Steps ***;
********************************************;
Step1:
X=Data(alpha,lambda);
call nlpnra(rc, MLE_ret, "MLE_func", x0_MLE, optn, con, tc);
alpha_mle =MLE_ret[1];
lambda_mle =MLE_ret[2];
************************* Find MOM ************************;
call nlplm(rc, MOM_ret, "MOM_Fun", x0_mom, optn, con); /* constraint matrix passed positionally */
alpha_mom =MOM_ret[1];
lambda_mom =MOM_ret[2];
*****************************************************************;
Step2:
B_alpha1 = J(B,1,0); B_Lambda1 = J(B,1,0);
B_alpha2 = J(B,1,0); B_Lambda2 = J(B,1,0);
Do i=1 to B;
   X1=Data(alpha_mle,lambda_mle);
   x0_MLE = MLE_ret[1] || MLE_ret[2];
   call nlpnra(rc, MLE_ret, "MLE_func", x0_MLE, optn, con, tc);
   B_alpha1[i]=MLE_ret[1];
   B_lambda1[i]=MLE_ret[2];
   X2=Data(alpha_mom,lambda_mom);
   x0_mom = mom_ret[1] || mom_ret[2];
   call nlplm(rc, mom_ret, "MOM_Fun", x0_mom, optn, con);
   B_alpha2[i] =mom_ret[1];
   B_lambda2[i] =mom_ret[2];
end;
****** Bootstrap Bias *****;
*************************************************;
* Rick: The bias is the difference between the Monte Carlo mean and the parameter for the simulation.
  Again, the parameter for your simulation is (alpha_mle, lambda_mle), NOT (alpha, lambda).;
B_alpha1_Bias[j]= ABS( B_alpha1[:]- alpha_mle ); B_Lambda1_Bias[j]= ABS( B_Lambda1[:]- Lambda_mle );
B_alpha2_Bias[j]= ABS( B_alpha2[:]- alpha_mom ); B_Lambda2_Bias[j]= ABS( B_Lambda2[:]- Lambda_mom );
end;
%odson
***************** END Of Simulations ************************;
Biasalpha1=B_alpha1_Bias[:]; BiasLambda1=B_Lambda1_Bias[:];
Biasalpha2=B_alpha2_Bias[:]; BiasLambda2=B_Lambda2_Bias[:];
print m N B;
print Biasalpha1 BiasLambda1;
print Biasalpha2 BiasLambda2;
totalTime = time() - t0;
print totalTime;
Quit;
```

---

I enjoy helping people solve problems with SAS, but I usually avoid debugging long, complicated programs; I don't usually have the time to debug other people's code. And if you are skillful enough to write a complicated program, I believe it is important that you develop debugging skills to match your programming skills.

Here's one last bit of help: read the article "Ten tips before you run an optimization".

For your problem, you need to perform Tips #1 and #6.

In particular, your modules are using the GLOBAL variable X, but your program is generating random data in X1 and X2. So every step is solving the same optimization problem.
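A minimal sketch of one possible repair, assuming the modules continue to read the GLOBAL matrix X (an illustration, not the poster's tested code), is to assign each freshly simulated sample to X before calling the optimizer:

```
/* Sketch: inner bootstrap loop with each sample assigned to the
   GLOBAL matrix X that the objective modules actually read */
Do i=1 to B;
   X = Data(alpha_mle, lambda_mle);     /* MLE resample goes into X, not X1 */
   call nlpnra(rc, MLE_ret, "MLE_func", x0_MLE, optn, con, tc);
   B_alpha1[i]  = MLE_ret[1];
   B_lambda1[i] = MLE_ret[2];

   X = Data(alpha_mom, lambda_mom);     /* MOM resample goes into X, not X2 */
   call nlplm(rc, MOM_ret, "MOM_Fun", x0_mom, optn, con);
   B_alpha2[i]  = MOM_ret[1];
   B_lambda2[i] = MOM_ret[2];
end;
```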

Good luck and best wishes in the future.
