03-24-2017 05:42 AM
I have scores for 1 million people.
Each person's score is between 400 and 650.
I want to extract a sample of exactly 2858 people's scores, and I also want the sample's average score to be 564.
How can I extract such a sample?
Any help or tips would be much appreciated.
03-24-2017 10:50 AM
How close to 564 must it be? If you require exactly that value, you may spend quite some time on it. Do you have a desired range for the values, or a target standard deviation?
And is this supposed to be anything resembling a random sample?
If not, then how many values in the data are exactly 564? If there are at least 2858 of them, just grab those. That is likely not useful for your purpose, but it would fit the bare bones of your request.
Or take 1429 each of the values 563 and 565.
Or many other selections would have the desired mean.
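The alternatives above can be checked directly. This is only a minimal sketch: it assumes the full data set is named have (as in the code in this reply) and that the score variable is named score, which is an assumption here.

```sas
/* Count how many people have a score of exactly 563, 564, or 565. */
/* In PROC SQL a comparison evaluates to 1 or 0, so SUM counts matches. */
proc sql;
   select sum(score = 564) as n_564,
          sum(score = 563) as n_563,
          sum(score = 565) as n_565
   from have;
quit;

/* Arithmetic check: 1429 scores of 563 plus 1429 scores of 565 */
/* make 2858 scores whose mean is written to the log by PUT.     */
data _null_;
   mean = (1429*563 + 1429*565) / 2858;
   put mean=;
run;
```

If n_564 is at least 2858, or n_563 and n_565 are each at least 1429, the corresponding selection is possible.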
I would probably start with:
proc surveyselect data=have out=want method=srs sampsize=2858;
run;
proc means data=want mean; var score;
run;
And see if the mean is "close enough".
This is cheap enough in time that you could even re-run the above code until you got something close.
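The re-run idea can be automated with a macro loop. The sketch below is one possible way to do it, not a definitive implementation. The macro name resample, its parameters lo, hi and max_tries, and the variable name score are all assumptions made for illustration. The cap on attempts is there because, with 2858 draws, the sample mean will usually land within a few points of the population mean, so the loop may never succeed if the population mean is far from the target range.

```sas
%macro resample(lo=560, hi=570, max_tries=50);
   %local try avg_score;
   %let try = 0;
   /* %DO %UNTIL tests its condition at the bottom of each pass,   */
   /* so avg_score is always set before it is first compared.      */
   /* %SYSEVALF is used because the mean is not an integer.        */
   %do %until (%sysevalf((&avg_score ge &lo and &avg_score le &hi)
                         or &try ge &max_tries));
      %let try = %eval(&try + 1);
      /* No SEED= option, so each call draws a different sample. */
      proc surveyselect data=have out=want method=srs
                        sampsize=2858 noprint;
      run;
      /* Store the sample mean in a macro variable. */
      proc sql noprint;
         select avg(score) into :avg_score trimmed from want;
      quit;
      %put NOTE: try &try gave a sample mean of &avg_score;
   %end;
%mend resample;

%resample()
```

When the macro stops, the data set want holds the last sample drawn. Check the log to see whether it stopped because the mean fell in range or because max_tries was reached.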
03-26-2017 03:28 AM
The average of the 2858 sampled scores does not have to be exactly 564.
I will repeat the sampling until I get an average between 560 and 570.
Anyway, thanks for your big help!!
03-26-2017 09:31 PM
One suggestion: if you use stratified sampling, the sampled observations can be drawn according to sampling weights.
Hopefully this code works for you.
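%macro resample;
%do %until (%sysevalf(&avg_score ge 560 and &avg_score le 564));
proc surveyselect data=sort_sample out=want sampsize=2858 noprint; run; /* out=want and sampsize=2858 are assumed */
proc sql noprint; select avg(score) into :avg_score from want; quit; /* mean of the sample just drawn */
%end;
%mend resample; %resample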
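For the stratified idea, one possible reading (a sketch under stated assumptions, not the poster's code) is to split the population at the target mean and choose the two stratum sample sizes so that the expected sample mean is 564. The data set names have, strat and want, the variable band, and the macro variables m_lo, m_hi, n_lo and n_hi are all hypothetical. The sketch also assumes both strata are non-empty and large enough to supply the requested sizes.

```sas
/* Flag each person as below the target (0) or at/above it (1). */
data strat;
   set have;
   band = (score >= 564);
run;

/* PROC SURVEYSELECT needs the input sorted by the STRATA variable. */
proc sort data=strat;
   by band;
run;

/* Mean score within each stratum. */
proc sql noprint;
   select avg(score) into :m_lo trimmed from strat where band = 0;
   select avg(score) into :m_hi trimmed from strat where band = 1;
quit;

/* Solve n_lo*m_lo + n_hi*m_hi = 2858*564 with n_lo + n_hi = 2858, */
/* rounding n_lo to the nearest whole number.                      */
%let n_lo = %sysevalf(2858 * (&m_hi - 564) / (&m_hi - &m_lo) + 0.5, floor);
%let n_hi = %eval(2858 - &n_lo);

/* Sample sizes are listed in stratum order: band=0, then band=1. */
proc surveyselect data=strat out=want method=srs
                  sampsize=(&n_lo &n_hi);
   strata band;
run;
```

The resulting sample is random within each stratum but not a simple random sample of the whole population, so it will not be representative of the overall score distribution.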