08-28-2013 07:41 AM
Hi, I want to draw a random sample from a table, and the number of rows in the output should be 25% of the table's row count.
Here is the code:
proc surveyselect data=temp.tableA method=urs sampsize=???
rep=1 seed=12345 out=temp.sonuc outhits;
id bayi pos key msisdn;
run;
proc print data=temp.sonuc noobs;
run;
How can I set sampsize to count(tableA)/4, i.e. 25%?
For example, if the table has 1000 rows, I need 250 rows in the output.
I do not want to hard-code the number; it must be calculated as 1/4 of the table's row count.
08-28-2013 07:53 AM
You could create a macro variable based on the number of obs in the dataset using the following.
%LET DSID = %SYSFUNC(OPEN(SASHELP.CLASS));
%LET NUMOBS = %SYSFUNC(ATTRN(&DSID.,NOBS));
%LET SAMP = %SYSEVALF(&NUMOBS./4,CEIL);
%LET RC = %SYSFUNC(CLOSE(&DSID));
%PUT *****************************&NUMOBS. &SAMP.;
Edited to make use of the CEIL modifier.
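To tie this back to the original question, here is a minimal sketch of the same macro code pointed at temp.tableA, with the result plugged into SAMPSIZE=. The dataset, output and ID variable names come from the first post; keeping METHOD=URS with OUTHITS is an assumption about what the poster wants, not something confirmed in the thread.

```sas
/* Count the rows in the table being sampled */
%let dsid   = %sysfunc(open(temp.tableA));
%let numobs = %sysfunc(attrn(&dsid, nobs));   /* nlobs would exclude rows marked for deletion */
%let rc     = %sysfunc(close(&dsid));

/* One quarter of the row count, rounded up to a whole number */
%let samp = %sysevalf(&numobs / 4, ceil);

/* OUTHITS writes one output row per selection, so the output has &samp rows */
proc surveyselect data=temp.tableA method=urs sampsize=&samp
                  seed=12345 out=temp.sonuc outhits;
    id bayi pos key msisdn;
run;
```

With a 1000-row table this sets SAMP to 250 without hard-coding the number.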
08-28-2013 08:18 AM
I did what you suggested, but it does not return exactly 1/4 of the table.
For example, my table has 425019 rows,
but the code returns 94239 rows.
I removed the extra options from the code, but the result did not change.
proc surveyselect data=temp.tableA method=urs samprate=0.25
seed=12345 out=temp.sonuc;
run;
08-28-2013 08:38 AM
This is due to the sampling method you are using (URS), which samples with replacement (i.e. it can select the same observation more than once). Without the OUTHITS option, the output dataset contains only one row per selected observation, so repeat selections do not add rows. If you change the method to SRS, which samples without replacement, you will get 25% of the observations every time.
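A minimal sketch of that change, reusing the dataset and output names from the post above (only the METHOD= value differs from the poster's code):

```sas
/* SRS samples without replacement, so every selected row is distinct */
proc surveyselect data=temp.tableA method=srs samprate=0.25
                  seed=12345 out=temp.sonuc;
run;
```

For the 425019-row table this should give about 106,255 rows, since 425019 * 0.25 = 106254.75 and the sample size has to be a whole number.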
I suggest you read the PROC SURVEYSELECT documentation; it answers all of these questions.
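As a rough check on the URS explanation, the expected number of distinct rows after drawing with replacement can be estimated as N * (1 - (1 - 1/N)**draws). The sketch below assumes equal selection probabilities and a sample size of one quarter of the table rounded up; the variable names are made up for illustration.

```sas
data _null_;
    pop_size = 425019;                  /* rows in the table, from the post above */
    draws    = ceil(pop_size / 4);      /* selections made with replacement */
    /* Each row is missed by a single draw with probability (1 - 1/pop_size) */
    expected_distinct = pop_size * (1 - (1 - 1/pop_size)**draws);
    put draws= expected_distinct= comma12.0;
run;
```

This works out to roughly 94,000 distinct rows, which is in line with the 94239 rows reported.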
08-28-2013 08:00 AM
Can't you use the SAMPRATE= option? From the documentation:
specifies the sampling rate, which is the proportion of units to select for the sample. The sampling rate r must be a positive number. You can specify r as a number between 0 and 1. Or you can specify r in percentage form as a number between 1 and 100, and PROC SURVEYSELECT converts that number to a proportion. The procedure treats the value 1 as 100% instead of 1%.
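Going by that description, the percentage form would look like the sketch below. It reuses the dataset names from the first post, and METHOD=SRS is an assumption chosen so that the output row count matches the rate.

```sas
/* samprate=25 is read as 25%, the same as samprate=0.25 */
proc surveyselect data=temp.tableA method=srs samprate=25
                  seed=12345 out=temp.sonuc;
run;
```

Note the edge case in the quoted text: samprate=1 means 100%, so a 1% sample has to be written as 0.01.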