Beginner at Simulation

12-18-2015 11:04 AM

I am new to simulation and would like to do what I hope is a simple case.

I have a dataset with N = 85. The data consist of the weights of truckloads of material that were weighed on scales, along with estimates of each load's weight based on machine performance. The per-truckload estimate is interesting and can be off by a good margin for an individual load, but what most end users will care about is its accuracy over the course of an entire field... 80 truckloads. So it's the sum of the predictions that matters to me.

So I want to run a simulation where I randomly draw 5%, 10%, 15%, 20%, etc. of the loads, run a regression on those, apply it to the entire population, and see where the variability in the error between the cumulative predicted mass and the cumulative weighed mass becomes acceptable from a practical standpoint.

Is this something I can execute with do loops? Or would it be possible to do it with proc surveyselect and a do loop?

Is there a good primer out there?

12-18-2015 11:15 AM

Here are two good references. One is paid, a book by Rick Wicklin; the other is a paper, "Don't Be Loopy", that covers simulation fairly well. If you post your stats-related questions in the Statistical Forum, Rick usually participates there as well.

http://www.sas.com/store/prodBK_65378_en.html

12-24-2015 07:24 AM

In a simulation, you start with a model and you simulate data from the model. It sounds like this is a bootstrap problem, not a simulation problem, because you talk about choosing 5%, 10%, etc., of real data.
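
To make the distinction concrete, here is a minimal sketch in Python (not SAS; it is only meant to show the logic). The function names and the model parameters (intercept, slope, sigma) are invented for illustration, and the "observed" rows are synthetic stand-ins for real measurements.

```python
import random

rng = random.Random(1)

def simulate_loads(n, rng, intercept=0.5, slope=0.97, sigma=1.0):
    """Simulation: generate (estimated, weighed) pairs from an ASSUMED model.
    The parameter values are hypothetical, chosen only for illustration."""
    rows = []
    for _ in range(n):
        est = rng.uniform(10, 20)
        weighed = intercept + slope * est + rng.gauss(0, sigma)
        rows.append((est, weighed))
    return rows

def bootstrap_sample(rows, rng):
    """Resampling WITH replacement: a classic bootstrap draw of the same size."""
    return [rng.choice(rows) for _ in rows]

def subsample(rows, frac, rng):
    """Resampling WITHOUT replacement: draw a fraction of the observed rows."""
    return rng.sample(rows, round(frac * len(rows)))

# Stand-in for the real 85 loads (synthetic here; real data would be read in).
observed = simulate_loads(85, rng)

boot = bootstrap_sample(observed, rng)   # rows may repeat
sub = subsample(observed, 0.10, rng)     # rows are distinct
print(len(boot), len(sub))
```

The first function needs a model to be specified up front; the other two only need the data that were already collected.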

If this is real data, you can use SURVEYSELECT to extract the data. Definitely read Cassell's paper or my article on sampling with replacement in SAS.
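
The procedure described in the question (draw a fraction of the loads, fit a regression, apply it to all loads, compare totals) could be sketched roughly as follows. This is Python rather than SAS and is only an illustration of the logic. It assumes a simple linear regression of weighed on estimated weight and sampling without replacement, which is roughly what SURVEYSELECT does with METHOD=SRS, SAMPRATE= and REPS=. The data are synthetic placeholders, and all names (`fit_line`, `total_error_spread`, `est`, `weighed`) are hypothetical.

```python
import random
import statistics

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b).
    Assumes the x values are not all identical."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    return a, b

def total_error_spread(est, weighed, rate, reps, rng):
    """For one sampling rate, return the relative error of the predicted
    total (vs. the weighed total) for each of `reps` random subsamples."""
    n = len(est)
    k = max(2, round(rate * n))          # need at least 2 loads to fit a line
    true_total = sum(weighed)
    errors = []
    for _ in range(reps):
        idx = rng.sample(range(n), k)    # draw k loads without replacement
        a, b = fit_line([est[i] for i in idx], [weighed[i] for i in idx])
        pred_total = sum(a + b * e for e in est)   # apply fit to ALL loads
        errors.append((pred_total - true_total) / true_total)
    return errors

# Synthetic placeholder data standing in for the 85 real loads.
rng = random.Random(12345)
est = [rng.uniform(10, 20) for _ in range(85)]
weighed = [0.5 + 0.97 * e + rng.gauss(0, 1.0) for e in est]

for rate in (0.05, 0.10, 0.15, 0.20, 0.50):
    errs = total_error_spread(est, weighed, rate, reps=1000, rng=rng)
    print(f"rate={rate:.2f}  sd of relative total error={statistics.pstdev(errs):.4f}")
```

Comparing the spread across rates shows where the error in the cumulative total becomes small enough to be acceptable. To sample with replacement instead, `rng.sample` could be swapped for `rng.choices(range(n), k=k)`.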