sayginf
Calcite | Level 5

Hi, I want to draw a random sample from a table, with the sample size equal to 25% of the table's row count.

here is the code:

proc surveyselect data=temp.tableA  method = urs sampsize = ???

   rep=1 seed=12345 out=temp.sonuc outhits;

   id bayi pos key  msisdn;

  

run;

proc print data = temp.sonuc noobs;

run;

How can I write something like sampsize=count(tableA)/4, or specify 25%?

For example, if the table has 1000 rows, I need 250 rows in the output.

I do not want to hard-code the number: it must count the rows of the table and then take 1/4 of that count.

9 REPLIES
Scott_Mitchell
Quartz | Level 8

You could create a macro variable based on the number of obs in the dataset using the following.

%LET DSID = %SYSFUNC(OPEN(SASHELP.CLASS));

%LET NUMOBS = %SYSFUNC(ATTRN(&DSID.,NOBS));

%LET SAMP = %SYSEVALF(&NUMOBS./4,CEIL);

%LET RC   = %SYSFUNC(CLOSE(&DSID));

%PUT *****************************&NUMOBS. &SAMP.;

Edited to make use of the CEIL modifier.
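
As a minimal sketch of how the macro variable could then be used, the step below passes &SAMP. to the SAMPSIZE= option so that the count is not hard-coded. It assumes the %LET statements above have already run against SASHELP.CLASS. METHOD=SRS and the output name WORK.SAMPLE are illustrative choices, not part of the original reply; for the poster's table, SASHELP.CLASS would be replaced by temp.tableA in both the OPEN call and the DATA= option.

```sas
/* Assumes &SAMP. was created by the %LET statements above */
proc surveyselect data=sashelp.class method=srs sampsize=&SAMP.
   seed=12345 out=work.sample;   /* WORK.SAMPLE is a hypothetical output name */
run;
```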

sayginf
Calcite | Level 5

Thank you, but I do not know how to write macros yet.

I will learn in the near future.

Keith
Obsidian | Level 7

You need to use the SAMPRATE option instead of SAMPSIZE.

proc surveyselect data=temp.tableA  method = urs samprate = 0.25

sayginf
Calcite | Level 5

I did what you said, but the result is not exactly 1/4 of the table.

For example, my table has 425019 rows,

but the code returns 94239 rows.

I removed the extra options from the code, but the result did not change:

proc surveyselect data=temp.tableA method = urs samprate = 0.25

  seed=12345 out=temp.sonuc ;

run;

Keith
Obsidian | Level 7

This is due to the sampling method you are using (URS), which allows replacement (i.e. it can select the same observation more than once).  The output dataset will only contain the unique values, so any duplicate observations have been dropped.  If you change the method to SRS then you will always get 25% of the observations each time.
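
The size of the shortfall can be estimated with a short calculation. This is a back-of-the-envelope sketch, not something taken from the PROC SURVEYSELECT documentation. If n draws are made with replacement from N rows, each row being equally likely on every draw, the expected number of distinct rows selected is:

```latex
E[\text{distinct rows}] = N\left(1 - \left(1 - \frac{1}{N}\right)^{n}\right) \approx N\left(1 - e^{-n/N}\right)
```

With N = 425019 and n/N = 0.25, this gives roughly 425019 × (1 − e^(−0.25)) ≈ 94014, which is in the same range as the 94239 rows reported above.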

I suggest you read the PROC SURVEYSELECT documentation; it answers all of these questions.
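
The two behaviours described in this reply can be sketched as follows. The first step samples without replacement (SRS), so every row appears at most once. The second keeps URS but adds the OUTHITS option, which writes one output row per hit instead of one row per distinct unit. The output name temp.sonuc_hits is a hypothetical name introduced for illustration. Note that the sample size has to be a whole number of rows, so a table of 425019 rows cannot yield exactly 25%.

```sas
/* Without replacement: each row is selected at most once */
proc surveyselect data=temp.tableA method=srs samprate=0.25
   seed=12345 out=temp.sonuc;
run;

/* With replacement, keeping one output row per hit (OUTHITS) */
proc surveyselect data=temp.tableA method=urs samprate=0.25
   seed=12345 out=temp.sonuc_hits outhits;   /* hypothetical output name */
run;
```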

sayginf
Calcite | Level 5

Thank you very much.

I will read it, of course, but this was urgent.

data_null__
Jade | Level 19

I would never trust an urgent need to the whims of a user forum.

Keith
Obsidian | Level 7

Likewise I would never run some SAS code without understanding what it was doing!

data_null__
Jade | Level 19

Can't you use the SAMPRATE= option?

SAMPRATE=r

RATE=r

specifies the sampling rate, which is the proportion of units to select for the sample. The sampling rate r must be a positive number. You can specify r as a number between 0 and 1. Or you can specify r in percentage form as a number between 1 and 100, and PROC SURVEYSELECT converts that number to a proportion. The procedure treats the value 1 as 100% instead of 1%.
