I have a table of, say, 234,567 records (2 variables) and need to randomly select 5% of each group (there are only 2 groups, A and B) and assign those selected records a variable new='randomly selected'. I'd like to keep the selected records in the same dataset as the non-selected ones,
so the resulting dataset should look like:
id group new
--------------
100 A randomly selected
101 A
102 B
103 A randomly selected
104 B
......
I know how to do it in two steps, but I was wondering if this can be done in one step?
Thanks,
proc sort data=have; by group; run;
proc surveyselect data=have out=want
    samprate=5;
    strata group;
run;
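The resulting data set will have all of the variables in the base data plus new variables holding the probability of selection and the sampling weight. By default it contains only the selected records; add the OUTALL option to keep every record along with a Selected indicator variable.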
Hi,
You can use the STRATA statement together with the OUTALL option, which creates a variable "Selected" that has the value 1 for the sampled 5% and 0 for the remaining 95%.
proc sort data=have; by group; run;
proc surveyselect data=have outall out=want
    samprate=5;
    strata group;
run;
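If you want the text label from the original question rather than a 0/1 flag, one possible follow-up is a short DATA step on the OUTALL output. This is only a sketch: it assumes the input variables are named id and group as in the example table, the intermediate data set name sampled is made up for illustration, and the SEED= value is arbitrary (it just makes the draw reproducible).

```sas
proc sort data=have; by group; run;

/* OUTALL keeps every record and adds Selected (1 = sampled, 0 = not). */
/* seed=12345 is an arbitrary value chosen for reproducibility.        */
proc surveyselect data=have out=sampled outall
    samprate=5 seed=12345;
    strata group;
run;

/* Turn the 0/1 flag into the requested text label.                    */
/* Assumes the original variables are named id and group.              */
data want;
    set sampled;
    length new $17;                 /* 'randomly selected' is 17 chars */
    if Selected = 1 then new = 'randomly selected';
    keep id group new;              /* drop the SURVEYSELECT extras    */
run;
```

This is still a PROC plus a DATA step, so if a 0/1 flag is enough, the Selected variable from OUTALL already does the job in a single step.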
thanks guys