12-14-2017 06:17 PM
I have a dataset called "boston" which has 100 observations. I'd like to create 2 datasets of 100 observations each by sampling with replacement from my original dataset "boston". I am using this code:
%let rep = 2;
proc surveyselect data = boston out = resample
    seed = 1347 method = urs
    samprate = 1 outhits rep = &rep;
run;
ods listing close;
This creates a dataset called "resample" which has a variable called "replicate" (= 1 or 2) which identifies my 100 observations for each of my 2 samplings. However, I would like to output 2 datasets each with its own sampling of 100 observations, such as resample1 and resample2. How can I do that?
Thanks very much!
12-14-2017 06:37 PM - edited 12-14-2017 06:41 PM
This might get you started:
You don't want to set the same fixed seed on every call if you want different samples, because the same seed reproduces the same sample.
SAMPSIZE may be more reliable than SAMPRATE if you want a specific number of resultant selections.
Here's one way, with an example call on a data set you should already have (sashelp.class), so you can check that it works correctly.
The values shown for reps and size are defaults that are used if they are not supplied in the call. The indataset must exist, reps cannot be less than 1, and size should be an integer > 0.
%macro resample (indataset=, outdata=, reps=2, size=100);
  %do i=1 %to &reps;
    proc surveyselect data = &indataset. out = &outdata.&i. noprint
      method = urs sampsize = &size outhits;
    run;
  %end;
%mend;
%resample (indataset=sashelp.class, outdata=work.resample, reps=2, size=5);
If you need this to be more flexible, for example to choose the method, you can add parameters following this pattern, but too many will likely complicate the code as you try to get the interactions between them straight.
Did you examine an output set with rep=2 to make sure it looked correct? The replicate values would likely not be what you want, and the number of records is another issue.
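One quick way to do that check is a frequency count of the replicate variable. This is only a sketch: it assumes the combined output is written to work.resample and uses sashelp.class in place of your own data. With OUTHITS every selection is its own row, so the count per replicate should equal the sample size you asked for.

```sas
/* Draw 2 replicates with replacement into a single data set */
proc surveyselect data=sashelp.class out=work.resample noprint
    seed=1347 method=urs samprate=1 outhits reps=2;
run;

/* One row per selection: the count per replicate is the sample size */
proc freq data=work.resample;
    tables replicate;
run;
```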
12-15-2017 10:56 AM
Although you can do this, it's generally not a good idea, because to process things further you then need a macro that runs over each data set, instead of just using a BY statement in your procedure.
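As a rough illustration of the BY-statement approach, assuming you keep all replicates in one data set (work.resample here) and want something like the mean of a variable per replicate. The data set sashelp.class, the variable height, and the output names are just placeholders for the example:

```sas
/* Keep both replicates in one data set, identified by Replicate */
proc surveyselect data=sashelp.class out=work.resample noprint
    seed=1347 method=urs samprate=1 outhits reps=2;
run;

/* Analyze every replicate in one pass; no macro loop is needed */
proc means data=work.resample noprint;
    by replicate;
    var height;
    output out=work.boot_means mean=mean_height;
run;
```

The output of PROC SURVEYSELECT is already ordered by replicate, so the BY statement works without a separate sort.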
If you're doing bootstrap or simulation, this may be worth reading. It goes over how to simulate data in SAS and why you don't want to do it this way, though it covers both approaches.