Beginner at simulation and want to get my feet wet

12-18-2015 11:26 AM

I am new to simulation and would like to do what I hope is a simple case.

I have a dataset with an N of 85. The data consist of the weights of truckloads of material that were weighed on scales, along with an estimate of each load's weight based on machine performance. The estimate for an individual truckload is interesting and can be off by a good margin, but what most end users will care about is its accuracy over the course of an entire field... 80 truckloads. So it's the sum of the predictions that matters to me.

So I want to run a simulation where I randomly draw 5%, 10%, 15%, 20%, etc. of the loads, run a regression on those, apply it to the entire population, and see where the variability in the error between the cumulative predicted mass and the cumulative weighed mass becomes acceptable from a practical standpoint.

Is this something I can execute with do loops? Or would it be possible to do it with proc surveyselect and a do loop?

Perhaps there are some good online examples or primers out there?

12-18-2015 10:49 PM

Please provide an example dataset.

PG

12-21-2015 11:15 AM - edited 12-21-2015 12:11 PM

Here is the data: day, load, the "true" weight of the load, and the machine's guess.

I want to simulate draws from this population, run a regression on each draw and output the slope and intercept, and use those betas to estimate the total mass of the entire population. Getting the "true" weight is inconvenient, but until we can perfect the way this machine guesses the weight, I'd like to figure out what reasonable rate of subsampling it would take to still get a decent estimate of the sum total.

Thanks, I'm eager to learn something about looping and macros.

12-21-2015 11:31 AM

I suspect proc surveyselect and a loop are what I need... perhaps as a macro.

I've done little with loops in the past, so I don't know if/how to embed a procedure in one. The syntax seems a little quirky.

12-21-2015 04:32 PM

Looping is not really needed for this type of investigation. You can manage with BY processing. Here is what you could do, assuming your data is in dataset **have**:

```
/* Define simulation for a single sample rate */
%macro simul(pct);
    /* Generate 100 random samples at the requested rate */
    proc surveyselect data=have samprate=&pct. rep=100 out=sample&pct.;
    run;

    /* Perform a regression on each sample to predict true weight */
    proc reg data=sample&pct. outest=est&pct. plots=none noprint;
        by replicate;
        predict: model weighed = estimated;
    run;
    quit;

    /* Predict true weight on the whole dataset, once per replicate */
    proc score score=est&pct. data=have out=score&pct. type=parms;
        by replicate;
        var estimated;
    run;

    /* Calculate total weights for each sample */
    proc sql;
        create table summ&pct. as
        select
            &pct. as sampRate label="Sample Rate",
            replicate,
            sum(predict) as totalPredicted label="Predicted Total Weight"
        from score&pct.
        group by replicate;
        /* Drop results from an earlier run at this rate; skipped on the
           first call, when the weights dataset does not exist yet */
        %if %sysfunc(exist(weights)) %then %do;
            delete from weights where samprate=&pct.;
        %end;
    quit;

    /* Accumulate results in weights dataset */
    proc append base=weights data=summ&pct.; run;
%mend simul;

/* Call macro for each sample rate */
%simul(5);
%simul(10);
%simul(15);
%simul(20);
%simul(25);
%simul(30);

/* Calculate the true total weight */
proc sql noprint;
    select sum(weighed) into :trueTotalWeight
    from have;
quit;

/* Look at the distributions of total weight estimates,
   robust measures of dispersion in particular */
proc univariate data=weights location=&trueTotalWeight. robustscale;
    by samprate;
    var totalPredicted;
run;
```
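
If you would rather look at the error itself than at the distribution of totals, something along these lines might help. It is only a sketch: it assumes the **weights** and **have** datasets from the code above already exist, and the names `errors`, `trueTotal` and `pctError` are made up for illustration.

```
/* Sketch only: assumes WORK.WEIGHTS and WORK.HAVE exist as built above. */
/* errors, trueTotal and pctError are hypothetical names.                */
proc sql;
    create table errors as
    select w.sampRate,
           w.replicate,
           w.totalPredicted,
           t.trueTotal,
           /* Signed error of each replicate's total, in percent */
           (w.totalPredicted - t.trueTotal) / t.trueTotal * 100
               as pctError label="Error in Predicted Total (%)"
    from weights as w,
         (select sum(weighed) as trueTotal from have) as t
    order by w.sampRate, w.replicate;
quit;

/* Spread of the percent error at each sample rate */
proc means data=errors n mean std min p5 median p95 max maxdec=2;
    class sampRate;
    var pctError;
run;
```

The one-row inline view is joined to every row of **weights**, so expect a note about a Cartesian product in the log. Comparing the p5 to p95 range across sample rates should show where the error in the total becomes acceptable.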

PG

12-21-2015 05:04 PM

Wow, a lot more than I was hoping. Thank you!!

12-31-2015 12:28 PM

@PGStats I have been using PROC SQL a lot (mostly SELECT statements: "So Few Workers Go Home On Time"). I know we can use the EXECUTE statement to manipulate an external RDBMS, but this statement in your program above is completely new to me. Can we use DATA step statements in PROC SQL, other than dataset options? Thanks!

`delete from weights where samprate=&pct.;`

12-31-2015 03:29 PM

There is more to standard SQL than **select** statements. Check out the **delete**, **insert** and **update** statements.

SAS additions are **dataset options**, as you mentioned, **macro variable creation** (select ... into :macrovar), and **SAS functions** (including powerful date, text distance, text matching, and user-defined FCMP functions).
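
As a rough illustration of those statements, here is a small sketch that works on a scratch copy of `sashelp.class`. The inserted row and the macro variable name `nStudents` are made up for the example, and the `trimmed` keyword assumes SAS 9.3 or later.

```
proc sql;
    /* Scratch copy of a dataset shipped with SAS */
    create table work.class as
    select * from sashelp.class;

    /* DELETE: remove rows matching a condition */
    delete from work.class
    where age < 12;

    /* INSERT: add a row (values invented for illustration) */
    insert into work.class (name, sex, age, height, weight)
    values ('Zed', 'M', 15, 66.5, 120);

    /* UPDATE: modify existing rows, here using a SAS function */
    update work.class
    set name = upcase(name)
    where sex = 'F';

    /* SELECT ... INTO: store a query result in a macro variable */
    select count(*) into :nStudents trimmed
    from work.class;
quit;

%put NOTE: work.class now has &nStudents. rows.;
```

None of these are DATA step statements; they are ordinary SQL statements that PROC SQL runs against SAS datasets.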

PG