Dear All:
I have over a hundred ID values. I want to extract just 5 to conduct a preliminary analysis.
Here is the code I wrote:
Proc sort data = have ; by ID ; run;
Data want; set have;
If ID = '345' or ID = '650' or ID = '656' or ID = '700' or ID = '725' then output;
run;
Is there a shorter way of writing this code to extract the first 5 ID values?
Randy
proc sort data=have out=havesorted;
by ID;
run;
data want;
set havesorted(obs=5);
run;
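If the goal is those five specific IDs rather than whichever five sort first, the chain of OR conditions can be shortened with the IN operator. This is only a sketch: it assumes ID is a character variable, as the quoted literals in the original code suggest. For a numeric ID, drop the quotes.

```sas
/* Sketch: keep only the listed IDs.                       */
/* Assumes ID is character; no prior sort is needed here.  */
data want;
  set have;
  where ID in ('345', '650', '656', '700', '725');
run;
```

A WHERE statement filters rows as they are read, so the PROC SORT step is not required for this selection.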
If you just want a random sample, PROC SURVEYSELECT is probably the best choice.
proc surveyselect data = sashelp.cars
sampsize = 5
out = work.cars_sample;
run;
If you want the sample to be repeatable, i.e. to get the same "random" sample the next time you run the program, you must specify a seed:
proc surveyselect data = sashelp.cars
seed=12345
sampsize = 5
out = work.cars_sample;
run;
You really want the FIRST 5 of the sorted dataset? Then:
proc sort data=have out=want;
by id;
run;
data want5;
set want (obs=5);
run;
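Note that OBS=5 keeps the first 5 observations, which matches the first 5 ID values only when each ID appears once. If HAVE can hold several rows per ID (an assumption, since the question does not say), one possible sketch is to build a list of the first 5 distinct IDs and then pull every row for them. The dataset names IDS and FIRST5 are made up for illustration.

```sas
/* Sketch, assuming HAVE may contain several rows per ID.  */
/* IDS and FIRST5 are hypothetical helper dataset names.   */
proc sort data=have(keep=ID) out=ids nodupkey;
  by ID;                      /* one row per distinct ID   */
run;

data first5;
  set ids(obs=5);             /* first 5 distinct IDs      */
run;

proc sql;
  create table want5 as
  select *
  from have
  where ID in (select ID from first5);
quit;
```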
But why not a random 5? (Edit: I added the overlooked "%let size=5;".)
%let size=5;
data want5 (drop=i already_sampled: );
array already_sampled {&size} ; /*Record obs num of selected obs */
i=1; /*To index the array already_sampled */
/* Keep going until already_sampled array is full */
do until (nmiss(of already_sampled{*})=0);
p=ceil(nrecs*rand('uniform')); /*Random integer from 1 through NRECS */
if whichn(p,of already_sampled{*})^=0 then continue; /*Skip below if this P already retrieved*/
set have nobs=nrecs point=p; /*Read the p'th obs, note NRECS is size of dataset HAVE*/
output;
already_sampled{i}=p;
i=i+1;
end;
stop;
run;