bncoxuk
Obsidian | Level 7

I took a few very tedious steps: first I created a few random variables, then sorted the data by them, and finally selected the first 200 observations. Because I need 100 random samples to test a model, I would have to repeat this process 100 times, which is impractical to do by hand.

I am curious whether there is code that automatically creates 100 random samples from the original dataset. Something like:

%let i=100;

data work.data01 work.data02 work.data03 ... work.data&i;
  set work.fulldata; /*fulldata has 1000 observations*/
  do j=1 to 100;
  ...output to different data sets with 200 observations;
  end;
run;


data_null__
Jade | Level 19

Use PROC SURVEYSELECT.  
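
For the case in the question (100 samples of 200 observations each, drawn from the 1000 observations in work.fulldata), a minimal sketch might look like the following. The output data set name work.samples and the seed value are illustrative assumptions and do not come from the thread.

```sas
/* Draw 100 independent simple random samples of 200 observations each. */
/* work.samples and the seed value are assumed names/values.            */
proc surveyselect data=work.fulldata out=work.samples
                  method=srs n=200 reps=100 seed=20240101 noprint;
run;

/* All samples land in one data set, identified by the Replicate variable. */
/* A frequency table of Replicate is a quick check on the sample sizes.    */
proc freq data=work.samples;
   tables replicate;
run;
```

Because every sample sits in the same data set, a later modeling step can process them with a BY REPLICATE statement rather than looping over 100 separate data sets.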

Ksharp
Super User

Just as _null_ said.

How about:

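/* %DO loops are not allowed in open code, so wrap the loop in a macro */
%macro draw_samples;
   %do i=1 %to 4;
      proc surveyselect data=list_stock method=srs n=4 out=sample&i noprint;
      run;
   %end;
%mend draw_samples;
%draw_samples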

Ksharp

bncoxuk
Obsidian | Level 7

Thanks for the reply.

Can PROC SURVEYSELECT accommodate a design in which one part of the original data must always be selected, while the other part is used for drawing random samples? To put it simply, suppose there is a variable called ind. If ind=1, all observations should be selected. If ind=0, only 100 observations (out of the original 500) should be selected. How can I add this to the SURVEYSELECT procedure?

data_null__
Jade | Level 19

You want a stratified sample where IND is the stratum variable. The RATE and N parameters accept a list of rates or Ns, one per stratum, in the sorted order of the strata. In the example, IND=0 selects 4 obs and IND=1 selects all 19 obs. Alternatively, you can use RATE. Be sure to use a different, well-chosen seed.

The REP parameter creates independent samples (4 in this example). Use REPLICATE in a BY statement instead of those %DOs: everything will be much faster and neatly contained in one data set.

data test;
   set sashelp.class(in=in1) sashelp.class;
   ind = in1;
   run;

proc sort data=test;
   by ind;
   run;

proc surveyselect rep=4 n=(4,19) /*rate=(.4,1)*/ data=test out=sample seed=443754790;
   strata ind;
   run;

proc sort data=sample;
   by replicate;
   run;

proc print data=sample;
   by replicate;
   id replicate;
   run;
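
Adapted to the case described in the question (keep every IND=1 observation and draw 100 of the 500 IND=0 observations, repeated 100 times), a sketch could use RATE so that the number of IND=1 observations does not need to be known in advance. The data set names work.fulldata_sorted and work.samples are hypothetical, and the rate of 0.2 assumes there are exactly 500 observations with IND=0.

```sas
/* STRATA requires the input to be sorted by the stratum variable. */
/* work.fulldata_sorted and work.samples are assumed names.        */
proc sort data=work.fulldata out=work.fulldata_sorted;
   by ind;
run;

/* Rates are listed in stratum order: IND=0 first, then IND=1.      */
/* 0.2 of 500 gives 100 observations; a rate of 1 keeps all of them. */
proc surveyselect data=work.fulldata_sorted out=work.samples
                  method=srs rate=(0.2, 1) reps=100 seed=443754790 noprint;
   strata ind;
run;

/* Sort by Replicate before any BY REPLICATE processing. */
proc sort data=work.samples;
   by replicate;
run;
```

If the IND=0 count is something other than 500, N=(100, k) with k set to the number of IND=1 observations would be the equivalent form.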

Ksharp
Super User

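You can also split the original dataset into two datasets, one containing ind=1 and the other containing ind=0, then:

/* %DO loops are not allowed in open code, so wrap the loop in a macro */
%macro draw_samples;
   %do i=1 %to 4;
      /* sample only from the ind=0 subset (ind_0 is an assumed name, by analogy with ind_1) */
      proc surveyselect data=ind_0 method=srs n=4 out=sample&i noprint;
      run;
      /* append all ind=1 observations to each sample */
      data sample&i;
         set sample&i ind_1;
      run;
   %end;
%mend draw_samples;
%draw_samples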

Ksharp
