Thank you, Rick, for your detailed response. Here are some of my thoughts on the differences you identified:

1. You mentioned that the choice of the initial value for the lower and upper index may differ. That's right. I started with Hahn and Meeker's guidance, choosing l=floor(p*(n+1)) and u=ceil(p*(n+1)), but later I saw the SAS theory page for calculating percentiles under The Univariate Procedure (http://support.sas.com/documentation/cdl/en/procstat/67528/HTML/default/viewer.htm#procstat_univariate_details14.htm). It states that the lower rank l and upper rank u are integers that are symmetric (or nearly symmetric) around [np]+1, where [np] is the integer part of np and n is the sample size. Furthermore, l and u are chosen so that the order statistics X(l) and X(u) are as close to the percentile estimate as possible while satisfying the coverage probability requirement. To align with what SAS does, I decided to use floor(np)+1 as the initial index for the incrementing/decrementing process.

2. In the example data, yes, the SAS interval (2281,2341) has a smaller coverage probability than the interval (2282,2342) that the R program produces. Here is my puzzle: if SAS implemented a more sophisticated algorithm, I don't know why the interval (2283,2343) is not chosen, since its coverage probability is the closest to 0.95 of the three.

pair of order statistics (2282,2342) with coverage probability of 0.9515646 (R function result)
pair of order statistics (2281,2341) with coverage probability of 0.9514861 (SAS result)
pair of order statistics (2283,2343) with coverage probability of 0.9506712 (smallest coverage probability that is still at least 0.95)

Below is how I searched around the original symmetric CI in R:

l<- 2282
u<- 2342
alpha<- 0.05
percentile<- 0.9
n<- 2568
library(tidyverse)
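# Candidate (l, u) pairs around the symmetric interval
tab<- tibble("l_candidate"=c(l,l+1,l-1,l,l,l+1,l+1),
             "u_candidate"=c(u,u+1,u-1,u+1,u-1,u,u)) %>%
  # Coverage probability P(l <= B <= u-1), where B ~ Binomial(n, percentile)
  mutate(prob=pbinom(q = u_candidate-1,size = n, prob = percentile) -
              pbinom(q = l_candidate-1,size = n, prob = percentile),
         logic=prob>=1-alpha) %>%
  # Keep pairs that reach the nominal level, then the one with the smallest coverage
  filter(logic) %>%
  filter(prob==min(prob))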
tab
# A tibble: 1 x 4
  l_candidate u_candidate  prob logic
        <dbl>       <dbl> <dbl> <lgl>
1        2283        2343 0.951 TRUE

3. I am sure that SAS's algorithm does something different from what is stated in the theory page. I would greatly appreciate any insight you can provide into the actual implementation on the SAS end.
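In case it helps anyone reproduce the numbers in point 2, here is a minimal base-R sketch that recomputes the coverage probability of the three candidate pairs and the two starting-index conventions from point 1. The helper name `coverage` and the variable names `start_hm` and `start_sas` are my own, chosen only for illustration; this is not SAS's implementation, just the binomial coverage formula used in the search above.

```r
# Coverage of the interval (X_(l), X_(u)) for the 100p-th percentile:
# P(l <= B <= u - 1), where B ~ Binomial(n, p).
# 'coverage' is a hypothetical helper name, not a function from any package.
coverage <- function(l, u, n, p) {
  pbinom(u - 1, size = n, prob = p) - pbinom(l - 1, size = n, prob = p)
}

n <- 2568
p <- 0.9
alpha <- 0.05

# Starting indices under the two conventions discussed in point 1
start_hm  <- c(lower = floor(p * (n + 1)), upper = ceiling(p * (n + 1)))
start_sas <- floor(n * p) + 1

# The three pairs of order statistics compared in point 2
pairs <- data.frame(l = c(2281, 2282, 2283), u = c(2341, 2342, 2343))
pairs$coverage <- coverage(pairs$l, pairs$u, n, p)
pairs$meets_level <- pairs$coverage >= 1 - alpha

print(start_hm)
print(start_sas)
print(pairs, digits = 7)
```

Printing with `digits = 7` avoids the rounding to 0.951 in the tibble output, which otherwise makes the three intervals look identical.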