I have a table of, let's say, 234,567 records (2 variables) and need to randomly select 5% of each group (only 2 groups, A and B) and assign those selected records a variable new='randomly selected'. I would like to keep the selected records in the same dataset as the non-selected ones.

So the resulting dataset should look like:

id   group  new

--------------------------------

100  A      randomly selected

101  A

102  B

103  A      randomly selected

104  B

......

I know how to do it in two steps, but I was wondering if this can be done in one step?

Thanks,

Proc sort data=have;by group;run;

Proc surveyselect data=have out=want

samprate=5;

strata group;

run;

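The resulting data set will have all of the variables in the base data plus new variables giving the probability of selection and the sampling weight. Note that by default it contains only the selected records; the OUTALL option is needed to keep the non-selected records as well.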

Hi,

You can use the STRATA statement together with the OUTALL option, which creates a variable "Selected" that has the value 1 for the sampled 5% of each group and 0 for the remaining 95%.

Proc sort data=have;by group;run;

Proc surveyselect data=have  outall  out=want

samprate=5;

strata group;

run;
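
To get the text variable asked for in the question, a short DATA step after the selection is one possible follow-up. This is only a sketch: it assumes the output data set want contains the 0/1 variable Selected (which is what OUTALL produces in PROC SURVEYSELECT), and it overwrites want in place, which you may or may not prefer.

```sas
/* Sketch: derive the text flag from the 0/1 Selected variable. */
/* Assumes want was created by PROC SURVEYSELECT with OUTALL.   */
data want;
  set want;
  length new $17;                      /* 'randomly selected' is 17 characters */
  if Selected = 1 then new = 'randomly selected';
  /* non-selected records keep a blank value for new */
run;
```

Adding seed= with a fixed positive integer to the PROC SURVEYSELECT statement should make the sample reproducible between runs.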

