I have a table of, say, 234,567 records (2 variables) and need to randomly select 5% of each group (there are only 2 groups, A and B) and assign the selected records a variable new='randomly selected'. I would like to keep the selected records in the same dataset as the non-selected ones,
so the resulting dataset should look like this:
id group new
--------------
100 A randomly selected
101 A
102 B
103 A randomly selected
104 B
......
I know how to do it in two steps, but I was wondering if this can be done in one step?
Thanks,
proc sort data=have;
  by group;
run;
proc surveyselect data=have out=want samprate=5;
  strata group;
run;
The resulting data set will have all of the variables in the base data plus new variables giving the probability of selection and the sampling weight. Note that, without the OUTALL option, it contains only the selected records.
Hi,
You can use the STRATA statement together with the OUTALL option, which keeps all records and creates a variable "Selected" that has the value 1 for the 5% that were sampled and 0 for the remaining 95%.
proc sort data=have;
  by group;
run;
proc surveyselect data=have outall out=want samprate=5;
  strata group;
run;
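To get the text flag the question asks for, a short DATA step after the OUTALL run could translate Selected into new. This is only a sketch: the data set names have and want and the variable names group and new come from the question, while the SEED= value 12345 is an arbitrary assumption added so the sample is reproducible.

```sas
proc sort data=have;
  by group;
run;

/* OUTALL keeps every input record and adds Selected (1 = sampled, 0 = not). */
/* SEED=12345 is an arbitrary, assumed value used only for reproducibility.  */
proc surveyselect data=have outall out=want seed=12345 samprate=5;
  strata group;
run;

data want;
  set want;
  length new $17;                 /* 'randomly selected' is 17 characters */
  if Selected = 1 then new = 'randomly selected';
  else new = ' ';                 /* non-selected records stay blank */
run;
```

The sampling variables that PROC SURVEYSELECT adds stay in want; they can be dropped in the same DATA step if only id, group, and new are needed.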
thanks guys