RandyStan
Fluorite | Level 6

Dear All:

   I have over a hundred ID values.  I want to extract just 5 to conduct a preliminary analysis.

 

Here is the code I wrote:

Proc sort data = have ; by ID ; run;

Data want; set have;

If ID = '345' or ID = '650' or ID = '656' or ID = '700' or ID = '725' then output;

run;

Is there a shorter way of writing this code to extract the first 5 ID values?

Randy

 

3 REPLIES
SASJedi
Ammonite | Level 13
proc sort data=have out=havesorted;
  by ID;
run;

data want;
  set havesorted(obs=5);
run;
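
If the goal is simply a shorter way to write the chain of OR conditions from the question, the IN operator is one option. This is a minimal sketch, assuming ID is a character variable (as the quoted values in the original post suggest) and reusing the dataset names HAVE and WANT from the question:

```sas
/* Keep only the rows whose ID is in the listed set.          */
/* Assumes ID is character; drop the quotes if it is numeric. */
data want;
  set have;
  where ID in ('345', '650', '656', '700', '725');
run;
```

No PROC SORT is needed for this filter, because the WHERE statement tests each row on its own regardless of the order of the data.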
Check out my Jedi SAS Tricks for SAS Users
SASJedi
Ammonite | Level 13

If you just want a random sample, PROC SURVEYSELECT is probably the best choice.

proc surveyselect data = sashelp.cars 
   sampsize = 5 
   out = work.cars_sample;
run;

If you want the sample to be repeatable, i.e. to get the same "random" sample the next time you run the program, you must specify a seed:

proc surveyselect data = sashelp.cars 
   seed=12345
   sampsize = 5 
   out = work.cars_sample;
run;
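
The examples above draw 5 observations. If HAVE holds several rows per ID and the aim is to draw 5 whole IDs, one possible approach is the SAMPLINGUNIT statement, which treats each ID as a single unit. This is only a sketch: it assumes a SAS/STAT release that supports SAMPLINGUNIT, and the dataset names HAVE and WANT are borrowed from the question rather than from the examples above.

```sas
/* Draw 5 IDs at random and keep every row belonging to them. */
/* HAVE, WANT and ID are taken from the original question.    */
proc surveyselect data=have
   method=srs
   seed=12345
   sampsize=5
   out=want;
   samplingunit ID;   /* each distinct ID is one sampling unit */
run;
```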
mkeintz
PROC Star

You really want the FIRST 5 of the sorted dataset?  Then:

 

 

proc sort data=have out=want;
  by id;
run;
data want5;
  set want (obs=5);
run;
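
Note that OBS=5 keeps the first 5 observations, which matches the first 5 ID values only when each ID appears on exactly one row. If an ID can repeat, a BY-group counter is one way to keep every row of the first 5 distinct IDs. This is a sketch under that assumption; ID_COUNT is a hypothetical helper variable, not something from the original post:

```sas
proc sort data=have out=havesorted;
  by ID;
run;

data want5;
  set havesorted;
  by ID;
  if first.ID then id_count + 1;   /* running count of distinct IDs  */
  if id_count > 5 then stop;       /* quit once a 6th ID is reached  */
  drop id_count;
run;
```

The STOP statement ends the step before the implicit OUTPUT at the bottom of the step, so no row from the sixth ID is written.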

 

But why not a random 5?  (edit: I added the overlooked "%let size=5;").

%let size=5;
data want5 (drop=i already_sampled: );
  array already_sampled {&size} ; /*Record obs num of selected obs */

  i=1;   /*To index the array already_sampled */

  /* Keep going until already_sampled array is full */
  do until (nmiss(of already_sampled{*})=0);
    p=ceil(nrecs*rand('uniform'));                       /*Random integer from 1 through NRECS */
    if whichn(p,of already_sampled{*})^=0 then continue; /*Skip below if this P already retrieved*/
    set have nobs=nrecs point=p;                         /*Read the p'th obs, note NRECS is size of dataset HAVE*/
    output;
    already_sampled{i}=p;
    i=i+1;
  end;
  stop;
run;

 

 

 
