Good Day,
I am trying to use PROC SURVEYSELECT to select N records from several datasets. I need 20 records, but some of the datasets have fewer than 20, which breaks the proc. I seem to recall a way to account for this and simply select every available record when a dataset has fewer than 20, but I can't remember how to code it. Here is the proc:
PROC SURVEYSELECT DATA=WORK.SORTTempTableSorted
OUT=WORK.Random_Sample
METHOD=SRS
N=20;
STRATA Type LOB / ALLOC=PROP;
RUN;
QUIT;
Any help is greatly appreciated.
-John
Look at the SELECTALL option.
PG
SELECTALL does not seem to fit this situation. I need a maximum of 20 rows. If I'm not understanding, please explain.
If you specify the SELECTALL option, PROC SURVEYSELECT selects all stratum units when the stratum sample size exceeds the number of units in the stratum.
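As a rough sketch of where the option goes, here is the proc from the original post with SELECTALL added. This is an illustration under assumptions, not a confirmed fix: it drops ALLOC=PROP, so N=20 becomes the sample size for each Type/LOB stratum rather than for the whole dataset, and it assumes the input is already sorted by the strata variables. Whether that matches the requirement depends on whether the limit of 20 is meant per stratum or per dataset.

```sas
/* Sketch only: assumes WORK.SORTTempTableSorted is already sorted by Type and LOB. */
/* Without ALLOC=, N=20 is the sample size for each stratum, not the overall total. */
proc surveyselect data=WORK.SORTTempTableSorted
                  out=WORK.Random_Sample
                  method=srs
                  n=20
                  selectall;   /* take every unit when a stratum has fewer than 20 */
    strata Type LOB;
run;
```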
Try this macro program, which runs the sampling step only if the dataset contains 20 or more observations.
%macro sel (ds=);
    /* Count the observations in the input dataset */
    proc sql noprint;
        select count(*) into :size from &ds;
    quit;
    %if &size >= 20 %then %do;
        PROC SURVEYSELECT DATA=&ds
            OUT=WORK.Random_Sample
            METHOD=SRS
            N=20;
            STRATA Type LOB / ALLOC=PROP;
        RUN;
    %end;
    %else %do;
        /* Too few observations: write a message to the log instead of sampling */
        %put Dataset: &ds - does not contain enough observations to sample;
    %end;
%mend sel;
%sel (ds=have);
jwillis: I have already read through the documentation; that's why I am asking here.
stat@sas: The macro program won't do, because I need whatever number of records are in the dataset when there are fewer than 20. Just writing a message that says there are not enough is not an option.
Hope this will solve the problem. If the dataset contains 20 or more observations, it selects a sample of 20; otherwise it selects everything in the dataset.
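%macro sel (ds=);
    /* Count the observations in the input dataset */
    proc sql noprint;
        select count(*) into :size from &ds;
    quit;
    %if &size >= 20 %then %do;
        /* Enough observations: draw a sample of 20 */
        PROC SURVEYSELECT DATA=&ds
            OUT=WORK.Random_Sample
            METHOD=SRS
            N=20;
            /* Strata variables taken from the original post */
            STRATA Type LOB / ALLOC=PROP;
        RUN;
    %end;
    %else %do;
        /* Fewer than 20 observations: select all of them */
        PROC SURVEYSELECT DATA=&ds
            OUT=WORK.Random_Sample
            METHOD=SRS
            N=&size;
            STRATA Type LOB / ALLOC=PROP;
        RUN;
    %end;
%mend sel;
%sel (ds=have);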
No problem. Many people do not look at the documentation first.
You could calculate your own sample sizes, as in this example:
/* An example dataset with
   file1 : 18 units in 6 strata
   file2 : 28 units in 4 strata */
data test;
    file = "file1";
    do type = 1 to 2;
        do LOB = 1 to 3;
            do i = 1 to 3;
                x + 1;
                output;
            end;
        end;
    end;
    file = "file2";
    do type = 1 to 2;
        do LOB = 1 to 2;
            do i = 1 to 7;
                x + 1;
                output;
            end;
        end;
    end;
    drop i;
run;
/* Compute the sample sizes: a proportional share of &sampleSize
   per stratum, capped at the number of units in the stratum */
%let sampleSize=20;
proc sql;
    create table sampSize as
    select file, type, LOB,
        min(strataSize, ceil(&sampleSize.*strataSize/sum(strataSize))) as _NSIZE_
    from (
        select file, type, LOB, count(*) as strataSize
        from test
        group by file, type, LOB)
    group by file
    /* Keep the strata in the same order as the input dataset */
    order by file, type, LOB;
quit;
/* Extract the sample, reading the stratum sample sizes from sampSize */
PROC SURVEYSELECT DATA=test
    OUT=testOut
    METHOD=SRS
    N=sampSize;
    STRATA file Type LOB;
RUN;
PG
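To check the allocation in the example above, one could count the sampled rows per file. This is only a sketch that reuses the testOut dataset and the file variable from that example; nSampled is a name made up here. Working the formula by hand with &sampleSize=20: each file1 stratum has 3 units, so min(3, ceil(20*3/18)) = min(3, 4) = 3, and each file2 stratum has 7 units, so min(7, ceil(20*7/28)) = min(7, 5) = 5. That suggests all 18 rows for file1 and 20 rows for file2.

```sas
/* Sketch: count the sampled rows per file in testOut (from the example above). */
/* nSampled is an illustrative column name, not part of the original example.   */
proc sql;
    select file, count(*) as nSampled
    from testOut
    group by file;
quit;
```

Because the formula rounds up with ceil, the total for a file can come out slightly above &sampleSize when the proportional shares are not whole numbers, so a check like this is worth running on real data.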