How can I extract sample with desired statistics

03-24-2017 05:42 AM

I have scores for 1 million people.

Each person's score is between 400 and 650.

I want to extract a sample of exactly 2858 people's scores, and I want the average score of that sample to be 564.

How can I extract this sample?

Any help or tips would be much appreciated.

Thanks, Jamie.

Posted in reply to jamie0111

03-24-2017 07:27 AM

Posted in reply to jamie0111

03-24-2017 10:50 AM

How close to 564 must it be? If you require exactly that, you may be spending some time. Do you have a desired range on the values? A standard deviation?

And is this supposed to be anything resembling a random sample?

If not, then how many values do you have in the data that are 564? If the number is at least 2858, then just grab 2858 of them. Likely not actually useful for your purpose, but it would fit the bare bones of your request.

Or 1429 each of the values 563 and 565.

Or many other selections would have the desired mean.
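To see whether either of these non-random selections is even possible, a quick count of the scores at and next to the target helps. This is only a sketch: the data set name `have` and the variable name `score` are assumed from the code in this thread, not confirmed by the original poster.

```sas
/* Sketch: count how many observations sit at or next to the target score. */
/* Assumes a data set named HAVE with a numeric variable SCORE.            */
proc freq data=have;
   where score in (563, 564, 565);
   tables score / nocum nopercent;
run;
```

If the count for 564 is at least 2858, or the counts for 563 and 565 are each at least 1429, the selections described above exist in the data.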

I would probably start with

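/* Draw one simple random sample of 2858 observations */
proc surveyselect data=have out=want sampsize=2858;
run;

/* Report the mean score of the sample */
proc means data=want mean;
   var score;
run;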

And see if the mean is "close enough".

This is cheap enough in time that you could even re-run the above code until you got something close.
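Rather than re-running by hand, one option is to let PROC SURVEYSELECT draw many replicate samples in a single call with `reps=` and then keep the replicate whose mean is closest to 564. The following is a rough sketch under assumptions: the data set `have` and variable `score` come from the code above, while `all_reps`, `rep_means`, `best`, and the choice of 100 replicates are made up for illustration.

```sas
/* Draw 100 independent simple random samples of 2858 in one pass.   */
/* The output data set gets a Replicate variable numbering each one. */
proc surveyselect data=have out=all_reps method=srs
                  sampsize=2858 reps=100 noprint;
run;

/* Mean score of each replicate and its distance from the 564 target, */
/* sorted so the closest replicate comes first.                       */
proc sql;
   create table rep_means as
   select Replicate,
          mean(score) as avg_score,
          abs(calculated avg_score - 564) as distance
   from all_reps
   group by Replicate
   order by distance;
quit;

/* Keep only the observations of the closest replicate. */
data want;
   if _n_ = 1 then
      set rep_means(obs=1 keep=Replicate rename=(Replicate=best));
   set all_reps;
   if Replicate = best;
   drop best;
run;
```

Checking `rep_means` first shows whether any replicate actually came "close enough" before `want` is used.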

Posted in reply to ballardw

03-26-2017 03:28 AM

Hi!

The average of the 2858 sampled scores does not have to be exactly 564.

I will repeat the sampling until the average is between 560 and 570.

Anyway, thanks for your big help!!
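Before looping, it may be worth checking whether a random sample can realistically land in 560 to 570 at all. The mean of a simple random sample of size n tends to fall within a few standard errors (roughly the population standard deviation divided by the square root of n) of the population mean. A minimal sketch, assuming the population is in a data set `have` with variable `score`; the macro variable names are hypothetical:

```sas
/* Population mean and standard deviation of SCORE in HAVE (assumed names). */
proc sql noprint;
   select mean(score), std(score)
      into :pop_mean trimmed, :pop_std trimmed
   from have;
quit;

/* Approximate standard error of the mean for a sample of 2858. */
%let se = %sysevalf(&pop_std / %sysfunc(sqrt(2858)));
%put NOTE: population mean=&pop_mean, approx. standard error=&se;
```

If the population mean is many standard errors away from the 560 to 570 range, repeated simple random sampling is unlikely to reach it, and a non-random or weighted selection would be needed instead.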

Posted in reply to jamie0111

03-26-2017 09:31 PM

I can suggest that if you use stratified sampling, the observations can be sampled according to the sampling weight.

Hopefully this code works for you.

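%macro do_sampling;
   %local avg_score;

   /* Redraw until the sample mean falls in the requested 560-570 range. */
   /* %sysevalf is needed because the mean is usually not an integer.    */
   %do %until (%sysevalf(&avg_score ge 560 and &avg_score le 570));

      /* No fixed SEED=, so each pass draws a different sample.           */
      /* STRATA score is omitted: with STRATA, n=2858 would be drawn from */
      /* every distinct score value instead of from the whole data set.   */
      proc surveyselect data=sort_sample
         method=srs n=2858
         out=sample_customer noprint;
      run;

      /* Average the sample just drawn, not the source data */
      proc sql noprint;
         select avg(score) into :avg_score trimmed from sample_customer;
      quit;

      %put NOTE: avg_score=&avg_score;
   %end;
%mend do_sampling;

%do_sampling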