I wrote the code below to randomly select 10 samples in proportion to the “class” distribution.
data list;
input Class $ Name $ 3-11 Marks ;
datalines;
A student1 100
A student2 100
A student3 90
A student4 80
A student5 70
A student6 60
A student7 50
A student8 40
A student9 30
A student10 20
B student11 100
B student12 90
B student13 90
B student14 80
B student15 70
B student16 60
C student17 100
C student18 100
C student19 100
C student20 50
;
run;
/* select random samples based on the proportion */
proc surveyselect data = list out = list_sample method = srs sampsize=10 seed = 9876;
strata class / alloc=proportional;
run;
proc freq data=list_sample;
run;
In the output sample, only one marks=100 case was selected in each class (three marks=100 cases in total).
But now I want to add one more requirement:
All marks=100 cases (6 in total here) should be included in the sample.
How can I ensure this? I then tried certainty sampling with the code below:
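data list2; set list;
if marks=100 then ranking=3;
else ranking=1; /* assumed: the post does not show this value; any size below certsize=2 would do */
run;
proc surveyselect data = list2 out = list2_sample method = pps certsize=2 sampsize=10 seed = 9876;
strata class / alloc=proportional;
size ranking; /* assumed: method=pps needs a SIZE statement, which the post does not show */
proc freq data=list2_sample; run;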
All marks=100 cases in classes A and B were selected, but there was an error for class C:
ERROR: The number of certainty units exceeds the specified sample size.
My idea is: if the number of certainty units exceeds the specified sample size, just choose randomly among those certainty units to meet the sample size. But I don't know how to make SAS do this.
My ultimate goal is to allocate the sample in proportion to the number of observations in each class (that's why I use proportional allocation), to select all marks=100 cases with first priority, and to select the remainder randomly from the rest of the pool.
Is there a better way to solve the problem? Many thanks for the help!
Class | Sample size based on proportion | No. of marks=100 cases in population | Samples should be taken from
A | 5 | 2 | 2 marks=100 + 3 random
B | 3 | 1 | 1 marks=100 + 2 random
C | 2 | 3 | 2 randomly selected out of the 3 marks=100
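The figures in this table can be checked directly against the data. The query below is only a minimal sketch, assuming the data set list from the first post. The column names n_total, alloc_size and n_100 are made up for illustration, and plain rounding stands in for PROC SURVEYSELECT's own proportional allocation, which may round differently on other data.

```sas
/* Sketch: per-class counts behind the table above.            */
/* n_total, alloc_size and n_100 are illustrative names only.  */
proc sql;
   select class,
          count(*)                                    as n_total,
          /* 10 is the overall sample size; simple rounding is an assumption */
          round(10 * count(*) / (select count(*) from list)) as alloc_size,
          /* a true comparison evaluates to 1, so the sum counts marks=100 rows */
          sum(marks = 100)                            as n_100
   from list
   group by class;
quit;
```

For the 20 students listed, the allocation works out as 10 × 10/20 = 5 for A, 10 × 6/20 = 3 for B and 10 × 4/20 = 2 for C, against 2, 1 and 3 marks=100 cases respectively.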
Certainty sampling actually works well for A and B, but I cannot figure out how to tell SAS what to do for C, where it is an error for the certainty units to exceed the stratum sample size.
Or is there a better way to solve the problem without using certainty sampling?
/* Get the proportional allocation */
proc surveyselect data = list out = list_alloc method = srs sampsize=10 seed = 9876;
strata class / alloc=proportional nosample;
run;

/* Subtract the marks=100 from allocation */
proc sql;
create table list_select as
select class,
    max(0, SampleSize - (select sum(marks=100) from list where class=a.class)) as sampleSize
from list_alloc as a;
quit;

/* Select remaining samples */
proc surveyselect data = list (where=(marks ne 100)) out = list_sample method = srs sampsize=list_select seed = 9876;
strata class;
run;

/* Join the marks=100 students to the random samples */
proc sql;
create table list_final_sample as
select * from list where marks=100
union all corr
select * from list_sample
order by class, name;
select * from list_final_sample;
quit;
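Keeping every marks=100 student means class C contributes 3 rows against an allocation of 2. If the total must stay at 10, one possible variant is to cap the priority picks at each class's allocation, as the question's table describes. The sketch below is an assumption-laden alternative, not part of the answer above: it reuses the NOSAMPLE allocation step, then replaces the second PROC SURVEYSELECT call with a random sort. The names list_alloc2, keyed, list_capped, priority, u and _n are hypothetical.

```sas
/* Sketch: marks=100 first, random fill, capped at the allocation */

/* 1. Proportional allocation per class, without drawing a sample */
proc surveyselect data=list out=list_alloc2 method=srs sampsize=10 seed=9876;
   strata class / alloc=proportional nosample;
run;

/* 2. Flag priority rows and attach a random key to every row */
data keyed;
   call streaminit(9876);        /* seed for rand() */
   set list;
   priority = (marks = 100);     /* 1 for marks=100, else 0 */
   u = rand('uniform');          /* random tie-breaker within each group */
run;

/* 3. Within each class: priority rows first, each group in random order */
proc sort data=keyed;
   by class descending priority u;
run;

/* 4. Keep the first SampleSize rows of each class */
data list_capped;
   merge keyed list_alloc2(keep=class SampleSize);
   by class;
   if first.class then _n = 0;
   _n + 1;                       /* running row count within the class */
   if _n <= SampleSize;
   drop _n u priority SampleSize;
run;
```

With the data in the question this should give 5, 3 and 2 rows for classes A, B and C, with class C filled by two of its three marks=100 students chosen at random. Because the rows inside each priority group are ordered by an independent uniform key, taking the first rows amounts to simple random sampling without replacement within that group.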