Configuring SAS for optimal processing during 15k ...

09-19-2013 11:59 AM

Hello,

I need to run a 15,000-iteration bootstrap on a sample that has around 11,000 observations and 260 fields. I ran a very similar program last year with the same number of iterations and observations, but only ~30 fields. It took around 24 hours to complete. This year, with the increased number of fields, it is taking unbelievably long: it's been 2 days and only around 3,000 iterations have been processed. I am assuming the increased number of fields is causing the extra time, but I can't be sure. So, my question is, are there options that can decrease the processing time, such as bufno=? My research has turned up mixed answers on whether this is useful for so many iterations.

This is my first question, so I'm not sure how much more information to provide. I appreciate any help SAS users can provide, either specific to my question or in terms of the best way to configure SAS when running so many iterations.

Thanks!

Accepted Solutions

Solution

Posted in reply to Tdot

09-25-2013 03:22 PM

*Simulating Data with SAS* describes how to bootstrap efficiently in SAS. If you don't have a copy of my book, then see Cassell (2007), which uses many of the same ideas.

All Replies

Posted in reply to Tdot

09-25-2013 03:41 PM

I suspect your program works like this....

for 15,000 iterations

- take a sample
- compute compute compute
- add the results of this iteration to a data set somewhere.

end

You will be home for dinner if you

- use PROC SURVEYSELECT to create one data set that contains all 15,000 samples.
- compute compute compute BY SAMPLE.

Do you really need 15,000 samples?

Show your work and we can help you fix it while you wait on books to arrive.
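As a rough illustration of the SURVEYSELECT plus BY-group approach described above, here is a minimal sketch. The data set WORK.HAVE, the response Y, and the regressors X1-X3 are placeholder names, not anything from the original program, and PROC REG simply stands in for whatever analysis is actually being bootstrapped.

```sas
/* Sketch only: WORK.HAVE, Y, and X1-X3 are hypothetical placeholder names. */

/* Step 1: draw all bootstrap samples in a single call.                  */
/* METHOD=URS samples with replacement, SAMPRATE=1 makes each replicate  */
/* the same size as the input, and OUTHITS writes one row per selection. */
/* The output data set gets a Replicate variable numbered 1..REPS.       */
proc surveyselect data=work.have out=work.bootsamp
     seed=12345 method=urs samprate=1 outhits reps=15000 noprint;
run;

/* Step 2: fit the model once per replicate using BY-group processing.   */
/* NOPRINT suppresses the printed output for the 15,000 fits, and        */
/* OUTEST= collects one row of coefficients per replicate.               */
proc reg data=work.bootsamp outest=work.bootest noprint;
   by Replicate;
   model y = x1-x3;
run;
quit;

/* Step 3: summarize the bootstrap distribution of each coefficient,     */
/* here with simple 2.5th and 97.5th percentiles.                        */
proc univariate data=work.bootest noprint;
   var x1-x3;
   output out=work.bootci pctlpts=2.5 97.5 pctlpre=x1_ x2_ x3_;
run;
```

With 11,000 observations and several hundred variables, the stacked sample data set can get very large. Two things that may help: keep only the variables the analysis needs, and consider dropping OUTHITS so that each selected observation appears once with a NumberHits count, which can then be supplied to the analysis procedure in a FREQ statement.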

Posted in reply to Tdot

09-25-2013 03:55 PM

Would you like to share what statistic you are bootstrapping? Since the computations depend on the number of variables, I am assuming some kind of multivariate statistic. Correlation? Regression? Something else?

Posted in reply to Rick_SAS

09-26-2013 07:47 AM

Hi Rick,

Thanks so much for your response, the SUGI paper you directed me to looks very useful. I will make changes and am hopeful this will help.

Since you asked, I am modeling healthcare expenditures by disease, which involves running a probit regression to find the probability that a person's spending is greater than zero (DV=spending dummy, IV=260 disease dummies plus 20 demographic variables), running an OLS regression on people with spending (DV=log expenditures, IVs=260 disease dummies), then using the probability and OLS coefficients to determine the share of spending for each disease, for each person, for each year. I sum over diseases to find total spending by disease, find per patient expenditures, then I create an overall price index (over nine years). Obviously this is just a quick summary, but that is the gist of what I'm doing.

The bootstrap is the first part of finding bias-corrected and accelerated (BCa) confidence intervals; the second step involves jackknifing.
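For reference, the standard BCa construction shows where each step enters: the bootstrap replicates supply the bias correction, and the jackknife supplies the acceleration. The notation below is assumed for illustration: $\hat\theta$ is the full-sample estimate, $\hat\theta^*_b$ are the $B$ bootstrap estimates, $\hat\theta_{(i)}$ are the $n$ leave-one-out jackknife estimates with mean $\hat\theta_{(\cdot)}$, and $z_p$ is the $p$-th quantile of the standard normal distribution.

```latex
% Bias correction, from the bootstrap replicates
\hat z_0 = \Phi^{-1}\!\left( \frac{1}{B} \sum_{b=1}^{B} \mathbf{1}\{ \hat\theta^*_b < \hat\theta \} \right)

% Acceleration, from the jackknife estimates
\hat a = \frac{ \sum_{i=1}^{n} \left( \hat\theta_{(\cdot)} - \hat\theta_{(i)} \right)^{3} }
              { 6 \left[ \sum_{i=1}^{n} \left( \hat\theta_{(\cdot)} - \hat\theta_{(i)} \right)^{2} \right]^{3/2} }

% Adjusted percentile levels for a 100(1-\alpha)% interval
\alpha_1 = \Phi\!\left( \hat z_0 + \frac{ \hat z_0 + z_{\alpha/2} }{ 1 - \hat a \left( \hat z_0 + z_{\alpha/2} \right) } \right),
\qquad
\alpha_2 = \Phi\!\left( \hat z_0 + \frac{ \hat z_0 + z_{1-\alpha/2} }{ 1 - \hat a \left( \hat z_0 + z_{1-\alpha/2} \right) } \right)
```

The interval endpoints are then the $\alpha_1$ and $\alpha_2$ quantiles of the bootstrap estimates $\hat\theta^*_b$.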

Again, thanks so much for your response.

Posted in reply to Tdot

09-27-2013 09:09 AM

Sounds interesting. You might also want to look at the %BOOT and %JACK macros. Various bootstrap confidence intervals are also available in the %BOOTCI macro. See SAS Sample 24982, "Jackknife and Bootstrap Analyses."