09-19-2013 11:59 AM
I need to run a 15,000-iteration bootstrap on a sample that has around 11,000 observations and 260 fields. I ran a very similar program last year with the same iterations and observations, but only ~30 fields, and it took around 24 hours to complete. This year, with the increased number of fields, it is taking unbelievably long: it's been 2 days and only around 3,000 iterations have processed. I assume the extra fields are causing the extra time, but I can't be sure. So my question is: are there options that can decrease the processing time, such as BUFNO=? My research has come up mixed on whether that option helps with this many iterations.
This is my first question, so I'm not sure how much more information to provide. I appreciate any help SAS users can give, either specific to my question or about the best way to configure SAS when running so many iterations.
09-25-2013 03:22 PM
09-25-2013 03:41 PM
I suspect your program works like this: for 15,000 iterations, you draw one bootstrap sample and then call your modeling procedure on it, probably inside a macro %DO loop. That means SAS starts and stops each procedure 15,000 times, and the procedure start-up overhead swamps the actual computation.
You will be home for dinner if you generate all of the samples in a single pass (for example with PROC SURVEYSELECT) and then run each analysis procedure once with BY-group processing over the replicates. See David Cassell's SUGI paper "Don't Be Loopy: Re-Sampling and Simulation the SAS Way."
Also: do you really need 15,000 samples?
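To make the one-pass pattern concrete, here is a minimal sketch. The dataset name `have`, the model variables, and the seed are placeholders I made up, not anything from this thread:

```sas
/* Step 1: draw all 15,000 bootstrap samples in one pass.
   method=urs samples with replacement; samprate=1 makes each
   replicate the same size as the original data; outhits writes
   one row per selection so BY processing sees complete samples. */
proc surveyselect data=have out=bootsamp noprint
     method=urs samprate=1 reps=15000 outhits seed=20130919;
run;

/* Step 2: run the analysis ONCE, by replicate, instead of
   invoking the procedure 15,000 times inside a macro loop. */
proc reg data=bootsamp outest=boot_est noprint;
   by Replicate;
   model y = x1 x2;
run;
```

The OUTEST= dataset then holds one row of coefficient estimates per replicate, ready for percentile or BCa computations.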
09-25-2013 03:55 PM
Would you like to share what statistic you are bootstrapping? Since the computations depend on the number of variables, I am assuming some kind of multivariate statistic. Correlation? Regression? Something else?
09-26-2013 07:47 AM
Thanks so much for your response, the SUGI paper you directed me to looks very useful. I will make changes and am hopeful this will help.
Since you asked: I am modeling healthcare expenditures by disease. First I run a probit regression for the probability that a person's spending is greater than zero (DV = spending dummy; IVs = 260 disease dummies plus 20 demographic variables). Then I run an OLS regression on people with positive spending (DV = log expenditures; IVs = 260 disease dummies). I combine the predicted probability and the OLS coefficients to determine the share of spending for each disease, for each person, for each year. Finally, I sum over diseases to find total spending by disease, compute per-patient expenditures, and build an overall price index over nine years. Obviously this is just a quick summary, but that is the gist of what I'm doing.
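For readers following along, the two-part model described above might look roughly like this in SAS. All dataset and variable names below are placeholders I invented for illustration, not the poster's actual code:

```sas
/* Part 1: probit for Pr(spending > 0). PROC LOGISTIC with
   link=probit is one way to fit a probit model in SAS. */
proc logistic data=meps;
   model any_spend(event='1') = dis1-dis260 age sex race / link=probit;
   output out=part1 predicted=p_spend;
run;

/* Part 2: OLS on log expenditures among people who spent anything. */
proc reg data=meps(where=(spend > 0)) outest=ols_est;
   model log_spend = dis1-dis260;
run;
```

The predicted probabilities and OLS coefficients can then be combined to attribute each person's expected spending across diseases, as described above; inside the bootstrap, both steps would run with a BY Replicate statement.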
The bootstrap is the first part of finding bias-corrected and accelerated (BCa) confidence intervals; the second step involves jackknifing.
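For reference, the standard BCa construction (Efron) that this bootstrap-plus-jackknife combination feeds into is sketched below, where \(\hat\theta^*_b\) are the \(B\) bootstrap replicates and \(\hat\theta_{(i)}\) are the jackknife leave-one-out estimates:

```latex
z_0 = \Phi^{-1}\!\left(\frac{\#\{\hat\theta^*_b < \hat\theta\}}{B}\right),
\qquad
a = \frac{\sum_{i=1}^{n}\bigl(\bar\theta_{(\cdot)} - \hat\theta_{(i)}\bigr)^3}
         {6\bigl[\sum_{i=1}^{n}\bigl(\bar\theta_{(\cdot)} - \hat\theta_{(i)}\bigr)^2\bigr]^{3/2}},
\qquad
\alpha_{1,2} = \Phi\!\left(z_0 + \frac{z_0 + z_{(\alpha,\,1-\alpha)}}
                                      {1 - a\,\bigl(z_0 + z_{(\alpha,\,1-\alpha)}\bigr)}\right)
```

The BCa interval endpoints are then the \(\alpha_1\) and \(\alpha_2\) empirical percentiles of the bootstrap distribution of \(\hat\theta^*\).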
Again, thanks so much for your response.
09-27-2013 09:09 AM
Sounds interesting. You might also want to look at the %BOOT and %JACK macros. Various bootstrap confidence intervals, including BCa, are available in the %BOOTCI macro. See SAS Sample 24982, "Jackknife and Bootstrap Analyses."