I have a table of, say, 234,567 records (2 variables) and need to randomly select 5% of each group (there are only 2 groups, A and B) and assign the selected records a variable new='randomly selected'. I'd like to keep the selected records in the same dataset as the non-selected ones,
so the resulting dataset should look like this:
id group new
--------------
100 A randomly selected
101 A
102 B
103 A randomly selected
104 B
......
I know how to do it in two steps, but I was wondering whether this can be done in one step.
Thanks,
Proc sort data=have;by group;run;
Proc surveyselect data=have out=want
samprate=5;
strata group;
run;
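The resulting data set will contain only the sampled records, with all of the variables from the base data plus new variables giving the probability of selection and the sampling weight. Add the OUTALL option to the PROC SURVEYSELECT statement if you also want to keep the unselected records; they are then distinguished by a Selected indicator variable.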
Hi,
You can use the STRATA statement together with the OUTALL option, which creates a variable "Selected" that has the value 1 for the sampled 5% of each group and 0 for the remaining 95%.
Proc sort data=have;by group;run;
Proc surveyselect data=have OUTALL out=want
samprate=5;
strata group;
run;
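To get the exact layout asked for in the question, the Selected flag still has to be turned into the text variable new. A minimal sketch of one way to do that is below; it assumes the input columns are named id and group as in the example table, and the SEED= value and METHOD=SRS are my own choices for a reproducible simple random sample, not something given in the thread. Note that this still takes a short DATA step after the procedure, so it is not strictly one step.

```sas
/* STRATA requires the input to be sorted by the stratum variable */
proc sort data=have; by group; run;

/* OUTALL keeps every record and adds Selected (1 = sampled, 0 = not sampled) */
/* seed=12345 is an arbitrary value, used only so the sample is repeatable    */
proc surveyselect data=have out=want outall
                  method=srs samprate=5 seed=12345;
   strata group;
run;

/* Convert the 0/1 flag into the requested character variable */
data want;
   set want;
   length new $17;                     /* 'randomly selected' is 17 characters */
   if Selected = 1 then new = 'randomly selected';
   keep id group new;                  /* assumed column names from the example */
run;

/* Quick check: about 5% of each group should be flagged */
proc freq data=want;
   tables group*new / missing;
run;
```

Records that were not sampled keep a blank value in new, which matches the example output in the question.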
thanks guys