Dear all,
I have a table that can be grouped by two variables (i.e. VARA and VARB).
How can I randomly select whole groups of observations defined by VARA and VARB, rather than single random observations?
Thank you in advance
Best regards
Nikos
For example:
data test;
    input varA varB x;
    datalines;
1 1 1
1 1 2
1 2 3
1 2 4
2 1 5
2 1 6
2 3 7
2 3 8
;
/* Extract all unique combinations of grouping keys */
proc sort data=test out=keys nodupkey;
    by varA varB;
run;
/* Assign a random number to each key combination */
data ranKeys;
    call streamInit(72564);
    set keys;
    ran = rand("UNIFORM");
    keep varA varB ran;
run;
/* Order grouping key combinations by random variable */
proc sort data=ranKeys;
    by ran;
run;
/* Select the desired fraction of key combinations (here 50%) */
data sampleKeys;
    set ranKeys nobs=nobs;
    if _n_ > nobs*0.5 then stop;
run;
/* Extract the data corresponding to selected key combinations */
proc sql;
    create table want as
    select test.*
    from test natural join sampleKeys;
    select * from want;
quit;
PG
Randomly select combinations of VARA and VARB first, then subset the data table to the selected combinations.
PG,
After you get all the unique combinations of the group variables, I think we could use PROC SURVEYSELECT to randomly select the groups and then merge them back to get what we need. What are your thoughts?
Xia Keshan
I think you can do the whole thing with surveyselect, using the CLUSTER statement. - PG
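A minimal sketch of what that could look like, assuming SAS/STAT is licensed and recent enough to support the CLUSTER statement. It reuses the test data set from the example above. The 50% rate, the seed, and the output name want_ss are illustrative choices, not requirements:

```sas
/* Sketch only: sample whole (varA, varB) groups in one step.       */
/* Assumes PROC SURVEYSELECT (SAS/STAT) with CLUSTER is available.  */
proc surveyselect data=test
                  out=want_ss      /* hypothetical output name       */
                  method=srs       /* simple random sampling         */
                  samprate=0.5     /* select 50% of the clusters     */
                  seed=72564;      /* fixed seed for reproducibility */
    cluster varA varB;             /* each key combination is a unit */
run;

proc print data=want_ss;
run;
```

As I understand it, the output should contain every observation from each selected varA/varB combination, so no separate merge step would be needed. The groups drawn will generally differ from those picked by the data step approach, even with the same seed.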
Dear PG,
Unfortunately I do not have PROC SURVEYSELECT.
Thank you
Nikos