Solved
Contributor
Posts: 68

Select random groups of observations rather unique observations

Dear all,

I have a table that can by grouped by two variables (i.e. VARA and VARB).

How can I randomly select groups of observations by VARA and VARB rather than single random observations.

Best regards

Nikos

Accepted Solutions
Solution
‎05-02-2014 10:02 PM
Posts: 5,539

Re: Select random groups of observations rather unique observations

For example:

data test;

input varA varB x;

datalines;

1 1 1

1 1 2

1 2 3

1 2 4

2 1 5

2 1 6

2 3 7

2 3 8

;

/* Extract all unique combinations of grouping keys */

proc sort data=test out=keys nodupkeys; by varA varB; run;

/* Assign a random number to each key combination */

data ranKeys;

call streamInit(72564);

set keys;

ran = rand("UNIFORM");

keep varA varB ran;

run;

/* Order grouping key combinations by random variable */

proc sort data=ranKeys; by ran; run;

/* Select the desired fraction of key combinations (here 50%) */

data sampleKeys;

set ranKeys nobs=nobs;

if _n_ > nobs*0.5 then stop;

run;

/* Extract the data corresponding to selected key combinations */

proc sql;

create table want as

select test.*

from test natural join sampleKeys;

select * from want;

quit;

PG

PG

All Replies
Posts: 3,054

Re: Select random groups of observations rather unique observations

Select VARA and VARB randomly

then subset the data table based on the random choices of VARA and VARB

--
Paige Miller
Solution
‎05-02-2014 10:02 PM
Posts: 5,539

Re: Select random groups of observations rather unique observations

For example:

data test;

input varA varB x;

datalines;

1 1 1

1 1 2

1 2 3

1 2 4

2 1 5

2 1 6

2 3 7

2 3 8

;

/* Extract all unique combinations of grouping keys */

proc sort data=test out=keys nodupkeys; by varA varB; run;

/* Assign a random number to each key combination */

data ranKeys;

call streamInit(72564);

set keys;

ran = rand("UNIFORM");

keep varA varB ran;

run;

/* Order grouping key combinations by random variable */

proc sort data=ranKeys; by ran; run;

/* Select the desired fraction of key combinations (here 50%) */

data sampleKeys;

set ranKeys nobs=nobs;

if _n_ > nobs*0.5 then stop;

run;

/* Extract the data corresponding to selected key combinations */

proc sql;

create table want as

select test.*

from test natural join sampleKeys;

select * from want;

quit;

PG

PG
Super User
Posts: 10,784

Re: Select random groups of observations rather unique observations

PG,

After you get all unique combination of group variable, I think we could use proc selectsurvey to randomly select the group ? and merge them back to get what we need ? Your thought ?

Xia Keshan

Posts: 5,539

Re: Select random groups of observations rather unique observations

I think you can do the whole thing with surveyselect, using the CLUSTER statement. - PG

PG
Contributor
Posts: 68

Re: Select random groups of observations rather unique observations

Dear PG,

Unfortunately I do not have proc selectsurvey.

Thank you

Nikos

🔒 This topic is solved and locked.