Hi Experts,
I have never done sampling in SAS. I have a population table with a million rows, and I have to extract 500 entries that meet the following criteria:
The sample should include all available distinct values from three columns.
How can I do this in SAS?
Thanks in advance.
Hi, you can use PROC SURVEYSELECT:
/* Keep only the columns of interest */
Data have ;
set Yourtable ;
keep var1 var2 var3 ;
run ;
/* Draw a simple random sample (SRS) of 500 rows */
proc surveyselect data=have
method=srs n=500 out=SampleSRS;
run;
Hello,
I have a follow-up to this question. I need random samples. However, if a 15% sample would contain fewer than 10 records, then I need the program to select 10 random records. Is there any way to do this?
You can use the NMIN= option to specify the minimum sample size.
proc surveyselect data=have method=srs rate=.15 nmin=10 out=SampleRate; /* out= name is a placeholder */
run;
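If the 15% rule has to hold within each group rather than for the whole table, NMIN= can be combined with a STRATA statement, since it sets the minimum sample size per stratum. Here is a minimal sketch, assuming a hypothetical grouping variable called group and an arbitrary seed; neither comes from the original posts, and the behavior for groups with fewer than 10 rows is worth checking in the PROC SURVEYSELECT documentation.

```sas
/* "group" is a hypothetical stratification variable; replace it with your own. */
/* The input must be sorted by the STRATA variable. */
proc sort data=have;
by group;
run;

/* 15% simple random sample per group, with at least 10 rows per group. */
/* seed= is an arbitrary value so that the sample can be reproduced. */
proc surveyselect data=have method=srs samprate=0.15 nmin=10
seed=12345 out=SampleByGroup;
strata group;
run;
```

Without the STRATA statement, the same options apply to the table as a whole.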
Thank you very much!
I just want to point out that the question says
> The sample should include all available distinct values from three columns
The PROC SURVEYSELECT code that you marked as "correct" does not necessarily "include all available distinct values." It simply extracts 500 random observations.
In general, it might be impossible to satisfy that constraint. For example, if X1 = _N_, then there are 1 million distinct values and no subset of 500 observations can include all distinct values. If you want to include all distinct values, you would have to sort the data, then use the FIRST.VAR technique to extract the distinct combinations:
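/* Sort so that identical (x1, x2, x3) combinations are adjacent */
proc sort data=have;
by x1 - x3;
run;
data distinct;
set have;
by x1 - x3;
/* Keep the first row of each distinct combination. */
/* first.x3 is already 1 whenever first.x1 or first.x2 is 1. */
if first.x1 | first.x2 | first.x3;
run;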
This method is unlikely to produce exactly 500 observations: the output contains one row per distinct combination of x1-x3, however many that is.
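To see in advance whether 500 rows could cover every combination, you can count the distinct combinations first. This is only a sketch, assuming the same have data set and the x1-x3 variables used above; n_combos is just an illustrative column name.

```sas
/* Count the distinct (x1, x2, x3) combinations in the population. */
proc sql;
select count(*) as n_combos
from (select distinct x1, x2, x3 from have);
quit;
```

If n_combos is larger than 500, no sample of 500 rows can contain every combination.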