Dear all,
I have a table that can be grouped by two variables (VARA and VARB).
How can I randomly select whole groups of observations, defined by VARA and VARB, rather than single random observations?
Thank you in advance
Best regards
Nikos
Accepted Solutions
For example:
data test;
input varA varB x;
datalines;
1 1 1
1 1 2
1 2 3
1 2 4
2 1 5
2 1 6
2 3 7
2 3 8
;
/* Extract all unique combinations of grouping keys */
proc sort data=test out=keys nodupkeys; by varA varB; run;
/* Assign a random number to each key combination */
data ranKeys;
call streamInit(72564);
set keys;
ran = rand("UNIFORM");
keep varA varB ran;
run;
/* Order grouping key combinations by random variable */
proc sort data=ranKeys; by ran; run;
/* Select the desired fraction of key combinations (here 50%) */
data sampleKeys;
set ranKeys nobs=nobs;
if _n_ > nobs*0.5 then stop;
run;
/* Extract the data corresponding to selected key combinations */
proc sql;
create table want as
select test.*
from test natural join sampleKeys;
select * from want;
quit;
PG
Randomly select combinations of VARA and VARB,
then subset the data table to the combinations that were selected.
Paige Miller
PG,
Once you have all unique combinations of the grouping variables, I think we could use proc surveyselect to randomly select the groups and then merge them back to get what we need. What do you think?
Xia Keshan
I think you can do the whole thing with surveyselect, using the CLUSTER statement. - PG
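A minimal sketch of what that might look like, assuming SAS/STAT is licensed and the release is recent enough to support the CLUSTER (alias SAMPLINGUNIT) statement. The data set names `test` and `want`, the 50% rate, and the seed are carried over from the example above for illustration only; check the output against your own data before relying on it.

```sas
/* Same toy data as in the accepted solution */
data test;
input varA varB x;
datalines;
1 1 1
1 1 2
1 2 3
1 2 4
2 1 5
2 1 6
2 3 7
2 3 8
;

/* Assumption: each distinct (varA, varB) combination is treated as one
   sampling unit, and simple random sampling is applied to those units
   rather than to individual observations */
proc surveyselect data=test out=want
                  method=srs      /* simple random sampling without replacement */
                  samprate=0.5    /* fraction of groups to select (here 50%) */
                  seed=72564;     /* fixed seed so the draw is repeatable */
   cluster varA varB;
run;

proc print data=want; run;
```

If the statement behaves as documented, `want` should hold every observation from the selected groups, so no separate join back to `test` is needed.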
Dear PG,
Unfortunately I do not have proc surveyselect.
Thank you
Nikos