Hi experts,
I would like to obtain the adjusted empirical CDF and density based on the BCA interval. I understand that the density and CDF based on the BCA interval can be obtained by weighting the original bootstrap estimates. Any help is appreciated.
Thanks in advance.
There are approximately 10 bazillion statistical topics that @Rick_SAS has written about, good chance one of them is what you want. Here it is: https://blogs.sas.com/content/iml/2017/07/12/bootstrap-bca-interval.html
Thanks @PaigeMiller ! I read @Rick_SAS's blog before and have since used it to compute the BCA confidence interval. But I am not trying to compute the BCA confidence interval. I am trying to obtain the empirical CDF and density, not just the two endpoints of the BCA confidence interval.
Have you looked at PROC KDE?
I am confused. Are you trying to compute the empirical CDF and estimate the density for DATA? If so, you do not need bootstrapping, and I don't understand what you are using the BCa interval for. Bootstrapping is for when you want to estimate the sampling distribution of a STATISTIC, and the BCa is an interval estimate for the PARAMETER that the statistic estimates.
IF YOU WANT THE CDF AND DENSITY OF DATA:
A histogram is an estimate of a density function. So is a kernel density estimate. You can estimate a CDF from data by using:
IF YOU WANT TO ESTIMATE A BOOTSTRAP DISTRIBUTION:
Many thanks @Rick_SAS @PaigeMiller .
The lower and upper limits of the BCA confidence interval are two points from an adjusted sampling distribution of the estimate. The adjustment is obtained by weighting the sampling distribution to account for bias and skewness. Much like the computation of the BCA confidence interval, which uses the bias-correction factor and the acceleration factor, the adjustment of the sampling distribution uses these two parameters. The adjusted sampling distribution therefore has its own eCDF and density, and there are formulas for both. The adjusted density can be obtained using the R package BCABOOT by specifying cd=1 when calling the function. Being a SAS lover, I'd rather use SAS if it can be done in SAS.
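For readers following along, here is a minimal sketch of how the bias-correction factor (z0) and the acceleration factor (a) shift the percentile levels in the standard BCa interval. It is written in Python with the standard library only. The function names and the simple order-statistic quantile rule are illustrative assumptions, not the SAS or BCABOOT implementation.

```python
from statistics import NormalDist

def bca_levels(z0, a, alpha=0.05):
    """Return the adjusted lower/upper percentile levels for a BCa interval."""
    nd = NormalDist()

    def adjust(p):
        z = nd.inv_cdf(p)
        # Standard BCa adjustment: Phi(z0 + (z0 + z) / (1 - a * (z0 + z)))
        return nd.cdf(z0 + (z0 + z) / (1.0 - a * (z0 + z)))

    return adjust(alpha / 2.0), adjust(1.0 - alpha / 2.0)

def bca_interval(boot_stats, z0, a, alpha=0.05):
    """Read the BCa endpoints off the sorted bootstrap estimates."""
    lo, hi = bca_levels(z0, a, alpha)
    s = sorted(boot_stats)
    B = len(s)

    def quantile(p):
        # Assumption: a crude order-statistic quantile, with no interpolation.
        idx = min(max(int(p * B), 0), B - 1)
        return s[idx]

    return quantile(lo), quantile(hi)
```

With z0 = 0 and a = 0 the levels reduce to the ordinary percentile interval (alpha/2 and 1 - alpha/2), which is a quick sanity check.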
Thanks
If I am correctly understanding your request, you need to use the double bootstrap.
For each bootstrap with B resamples, you get a value for the bias-correction factor and the acceleration factor that is used in the BCa interval. It sounds like you want to see how these values vary across similar bootstrap analyses. To estimate the sampling variability of these parameters, you would conduct C bootstrap analyses, each containing B resamples, and then look at the bootstrap distribution of the C estimates for the bias-correction and the acceleration.
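A rough sketch of that idea, in Python with the standard library only. Several things here are assumptions for illustration: the outer loop resamples the data C times (one common reading of "double bootstrap"), z0 is estimated from the proportion of bootstrap estimates below the observed statistic, and the acceleration comes from the usual jackknife formula. All function names are hypothetical.

```python
import random
from statistics import NormalDist, mean

def bias_correction(boot_stats, theta_hat):
    """z0 = Phi^{-1}(proportion of bootstrap estimates below theta_hat)."""
    B = len(boot_stats)
    prop = sum(1 for t in boot_stats if t < theta_hat) / B
    # Keep the proportion strictly inside (0, 1) so inv_cdf is defined.
    prop = min(max(prop, 0.5 / B), 1.0 - 0.5 / B)
    return NormalDist().inv_cdf(prop)

def acceleration(data, stat):
    """Jackknife estimate of the acceleration factor."""
    n = len(data)
    jack = [stat(data[:i] + data[i + 1:]) for i in range(n)]
    jbar = mean(jack)
    num = sum((jbar - j) ** 3 for j in jack)
    den = 6.0 * sum((jbar - j) ** 2 for j in jack) ** 1.5
    return num / den

def one_analysis(data, stat, B, rng):
    """One bootstrap analysis with B resamples; returns (z0, a)."""
    n = len(data)
    boot = [stat(rng.choices(data, k=n)) for _ in range(B)]
    return bias_correction(boot, stat(data)), acceleration(data, stat)

def double_bootstrap(data, stat, B, C, rng):
    """Repeat the analysis on C outer resamples; returns C (z0, a) pairs."""
    results = []
    for _ in range(C):
        outer = rng.choices(data, k=len(data))
        results.append(one_analysis(outer, stat, B, rng))
    return results
```

For example, `double_bootstrap(data, mean, B=1000, C=200, rng=random.Random(1))` returns 200 pairs whose spread gives a feel for how variable z0 and a are.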
Many thanks @Rick_SAS.
Let's say we are interested in computing the BCA confidence interval for the mean difference. The BCA method gives a confidence interval for the mean difference based on adjusted percentiles. I am not interested in the sampling distribution of the bias-correction factor and acceleration factor. I am interested in the CDF or density of the adjusted mean difference, such that if we were to plot it, two points on this density would correspond to the BCA confidence interval. The implementation in R is based on the formula from Bradley Efron (page 15 of the JSM 2016 slides on his website). Bradley Efron called this weighting w(theta) on theta.
Thanks!!
If you have a link, please post it. We will all save a lot of time if you post all the relevant information that explains the problem and how to solve it.
Thanks @Rick_SAS for your reply. I had to search for a more readable treatment that explains the concept clearly, rather than a presentation slide, which is only a summary. Hence the delay in posting.
Bayesian inference and the parametric bootstrap (arxiv.org)
Above is a link to a paper on the BCA density based on the weighted bootstrap distribution. To save you time, the formula for the BCA density weighting is Equation 2.17. Once you have the weighted density from Equation 2.17, the CDF is given in Equation 2.15.
Below is the implementation in the BCABOOT package in R. As one may observe, the implementation is consistent with Equation 2.17 in the Efron paper linked above.
vl <- list(call = call, lims = lims, stats = stats, ustats = ust0, seed = seed)
if (length(trun) > 1) vl$amat <- vl0$amat
if (cd == 1) {
    a <- stats[1, 4]
    z0 <- stats[1, 5]
    G <- (rank(tt) - 0.5)/B
    zth <- stats::qnorm(G) - z0
    az <- 1 + a * zth
    num <- stats::dnorm(zth/az - z0)
    den <- az^2 * stats::dnorm(zth + z0)
    w <- num/den
    vl$w <- w
}
Thanks!