bncoxuk
Obsidian | Level 7

I went through several tedious steps: first I created a few random variables, then sorted the data by them, and finally kept the first 200 observations. Because I need 100 random samples to test a model, I would have to repeat this process 100 times, which is a poor and labor-intensive approach.

I am curious whether there is code that automatically creates 100 random samples from the original dataset. Something like:

%let i=100;

data work.data01 work.data02 work.data03 ... work.data&i;
  set work.fulldata; /*fulldata has 1000 observations*/
  do j=1 to 100;
  ...output to different data sets with 200 observations;
  end;
run;


5 REPLIES
data_null__
Jade | Level 19

Use PROC SURVEYSELECT.  
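
For the original question, a minimal sketch might look like the following. It assumes the input is work.fulldata from the question; the output name work.samples and the seed value are placeholders chosen for illustration. REPS=100 asks for 100 independent simple random samples of 200 observations each, stacked in one data set and identified by the Replicate variable that the procedure adds.

```sas
/* Sketch: 100 independent simple random samples of 200 obs each. */
/* work.samples and the seed value are illustrative assumptions.  */
proc surveyselect data=work.fulldata
                  out=work.samples
                  method=srs      /* simple random sampling without replacement */
                  sampsize=200    /* observations per sample */
                  reps=100        /* number of independent samples */
                  seed=20240101   /* fixed seed so the draw can be reproduced */
                  noprint;
run;
```

Later steps can then process each sample with a BY Replicate statement instead of looping over 100 separate data sets.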

Ksharp
Super User

Just as _null_ said.

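How about this (the %DO loop has to sit inside a macro, since an iterative %DO is not allowed in open code):

%macro draw_samples;
%do i=1 %to 4;
proc surveyselect data=list_stock method=srs n=4 out=sample&i noprint;
 run;
%end;
%mend draw_samples;
%draw_samples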

Ksharp

bncoxuk
Obsidian | Level 7

Thanks for the reply.

Can PROC SURVEYSELECT handle the case where one part of the original data must always be selected, while the other part is used for drawing random samples? To put it simply, suppose there is a variable called ind. If ind=1, all observations should be selected. If ind=0, only 100 observations (out of the original 500) should be selected. How can I add this to the SURVEYSELECT procedure?

data_null__
Jade | Level 19

You want a stratified sample where IND is the stratum variable.  The RATE and N parameters accept a list of rates or Ns, one per stratum, in the order the strata appear in the sorted input data.  In the example, IND=0 selects 4 obs and IND=1 selects all obs (N=19, since SASHELP.CLASS has 19 obs).  Or you can use RATE.  Be sure to use a different, well-chosen seed.

The REP parameter creates independent samples, 4 in this example.  Use REPLICATE in a BY statement instead of those %DOs.  Everything will be much faster and neatly contained in one data set.

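/* Stack SASHELP.CLASS twice; rows from the first copy get ind=1 */
data test;
   set sashelp.class(in=in1) sashelp.class;
   ind = in1;
   run;

/* STRATA requires the input to be sorted by the stratum variable */
proc sort data=test;
   by ind;
   run;

/* 4 replicates: 4 obs from ind=0, all 19 obs from ind=1 */
proc surveyselect rep=4 n=(4,19) /*rate=(.4,1)*/ data=test out=sample seed=443754790;
   strata ind;
   run;

proc sort data=sample;
   by replicate;
   run;

proc print data=sample;
   by replicate;
   id replicate;
   run;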
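
Applied to the case in the question, the same pattern could be sketched as below. The assumptions are: the input is work.fulldata with a 0/1 variable ind and 500 rows where ind=0, so a rate of 0.2 should correspond to 100 observations; work.sorted, work.samples, work.estimates, the seed, and the variables y and x in the model step are hypothetical placeholders. A rate of 1 for the second stratum keeps every ind=1 row in each replicate.

```sas
/* Sketch only: data set names, seed, y and x are illustrative assumptions. */

/* STRATA needs the input sorted by the stratum variable */
proc sort data=work.fulldata out=work.sorted;
   by ind;
run;

/* 100 replicates: 20% of the ind=0 rows, 100% of the ind=1 rows */
proc surveyselect data=work.sorted out=work.samples
                  method=srs
                  samprate=(0.2, 1)   /* one rate per stratum: ind=0, then ind=1 */
                  reps=100
                  seed=20240101
                  noprint;
   strata ind;
run;

/* Make sure rows are grouped by replicate before BY processing */
proc sort data=work.samples;
   by replicate;
run;

/* Fit the same (hypothetical) model once per replicate */
proc reg data=work.samples outest=work.estimates noprint;
   by replicate;
   model y = x;
run;
quit;
```

With this layout, work.estimates would hold one row of parameter estimates per replicate, which avoids managing 100 separate sample data sets.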

Ksharp
Super User

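You can also split the original dataset into two datasets, one containing ind=1 and the other containing ind=0,

then

%macro draw_samples;
%do i=1 %to 4;
/* list_stock is assumed to hold only the ind=0 rows */
proc surveyselect data=list_stock method=srs n=4 out=sample&i noprint;
run;
/* append all ind=1 rows (dataset ind_1) to each sample */
data sample&i;
set sample&i ind_1;
run;
%end;
%mend draw_samples;
%draw_samples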

Ksharp

