Dear all,
I have a table that can be grouped by two variables (i.e. VARA and VARB).
How can I randomly select whole groups of observations defined by VARA and VARB, rather than single random observations?
Thank you in advance
Best regards
Nikos
Randomly select combinations of VARA and VARB,
then subset the data table to the observations that match the selected combinations.
For example:
data test;
input varA varB x;
datalines;
1 1 1
1 1 2
1 2 3
1 2 4
2 1 5
2 1 6
2 3 7
2 3 8
;
/* Extract all unique combinations of grouping keys */
proc sort data=test out=keys nodupkeys;
    by varA varB;
run;

/* Assign a random number to each key combination */
data ranKeys;
    call streamInit(72564);
    set keys;
    ran = rand("UNIFORM");
    keep varA varB ran;
run;

/* Order grouping key combinations by random variable */
proc sort data=ranKeys;
    by ran;
run;

/* Select the desired fraction of key combinations (here 50%) */
data sampleKeys;
    set ranKeys nobs=nobs;
    if _n_ > nobs*0.5 then stop;
run;

/* Extract the data corresponding to selected key combinations */
proc sql;
    create table want as
    select test.*
    from test natural join sampleKeys;
    select * from want;
quit;
PG
PG,
After you get all unique combinations of the group variables, I think we could use PROC SURVEYSELECT to randomly select the groups and then merge them back to get what we need. What are your thoughts?
Xia Keshan
I think you can do the whole thing with surveyselect, using the CLUSTER statement. - PG
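A minimal sketch of what the CLUSTER approach might look like, assuming SAS/STAT is licensed and recent enough to support the CLUSTER statement in PROC SURVEYSELECT. The output name `want_cluster` and the 50% rate are illustrative choices, not something specified in the thread:

```sas
/* Same example data as in the earlier post */
data test;
    input varA varB x;
    datalines;
1 1 1
1 1 2
1 2 3
1 2 4
2 1 5
2 1 6
2 3 7
2 3 8
;

/* Treat each (varA, varB) combination as one sampling unit.   */
/* Simple random sampling of clusters at a 50% rate; the       */
/* output should contain every observation of each selected    */
/* cluster. want_cluster is a hypothetical output name.        */
proc surveyselect data=test out=want_cluster
                  method=srs samprate=0.5 seed=72564;
    cluster varA varB;
run;

proc print data=want_cluster;
run;
```

If this works as described in the documentation for your release, it replaces the sort, random-number, and join steps with a single procedure call.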
Dear PG,
Unfortunately I do not have PROC SURVEYSELECT.
Thank you
Nikos
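For sites without SAS/STAT, one Base SAS alternative is sketched below. Note that this is a different technique from the key-sorting approach posted earlier: each group is kept independently with probability 0.5, so the number of selected groups varies from run to run instead of being exactly half. The names `sorted`, `want_bern`, and `keep_group` are hypothetical, and the `test` data set from the earlier example is assumed to exist:

```sas
/* BY-group processing needs the data ordered by the keys */
proc sort data=test out=sorted;
    by varA varB;
run;

data want_bern;
    set sorted;
    by varA varB;
    retain keep_group;
    /* Seed the random stream once */
    if _n_ = 1 then call streaminit(72564);
    /* Draw one random number per (varA, varB) group and  */
    /* remember the decision for the rest of that group   */
    if first.varB then keep_group = (rand("UNIFORM") < 0.5);
    /* Subsetting IF: output only rows of kept groups */
    if keep_group;
    drop keep_group;
run;
```

Use the key-sorting approach instead when the sample must contain a fixed fraction of the groups.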