Advanced problem for Strata Sampling issue

Contributor
Posts: 59

Hi All,

My earlier problem was resolved by adding the SELECTALL option to the stratified sampling, so that all obs in a stratum are selected when the stratum has fewer obs than the requested sample size.

The code is:

proc sort data=filein;
  by var1;
run;

proc surveyselect data=filein method=srs sampsize=20
                  seed=12345 out=fileout selectall;
  strata var1;
run;

My real problem is that:

Strata var1   Ideal sample   Situation 1      Situation 2
value         size           Real    Want     Real    Want
A             20             10      10       10      10
B             20             30      25       15      15
C             20             50      25       50      35
Total         60                     60               60

I want the final sample size (A, B and C combined) to be 60.

How can we code logic like that?

Thanks for your help.

Regards.

William


Accepted Solutions
Solution
‎01-18-2015 04:13 PM
Respected Advisor
Posts: 4,919

Re: Advanced problem for Strata Sampling issue

Posted in reply to wtien196838

To get exactly what you want you will need to do some of the work yourself, as in this example based on sashelp.class:

/* Get the strata sizes, ordered by increasing size */
proc sql;
  create table strata as
  select age, count(*) as real
  from sashelp.class
  group by age
  order by real;
quit;

/* Set the total sample size */
%let totalSample=12;

/* Calculate effective sample sizes by iteratively allocating samples equally
   among leftover strata, starting with the smallest stratum */
data sizes;
  retain sampleLeft (&totalSample);
  set strata nobs=nStrata;
  SampleSize = min(real, round(sampleLeft/(nStrata-_n_+1)));
  sampleLeft + (-SampleSize);
  drop sampleLeft;
run;

/* Give the same stratum order to data and strata sizes, as required by proc surveyselect */
proc sort data=sashelp.class out=class; by age; run;
proc sort data=sizes; by age; run;

/* Call surveyselect to do the random sampling with the calculated strata sizes */
proc surveyselect data=class out=mySample seed=85687 selectall sampsize=sizes;
  strata age;
run;
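
If the same allocation has to be repeated for many inputs, the steps above could be wrapped in a macro. This is only a sketch, not part of the original reply: the macro name stratSample, its parameters, and the work tables _strata, _sizes and _sorted are hypothetical, and filein, var1 and fileout are the names used in the question.

```sas
/* Hypothetical wrapper around the steps above; all names are illustrative */
%macro stratSample(data=, strata=, total=, out=, seed=12345);

  /* Stratum counts, smallest stratum first */
  proc sql;
    create table _strata as
    select &strata, count(*) as real
    from &data
    group by &strata
    order by real;
  quit;

  /* Split what is left of the total equally among the remaining strata,
     capped at the number of obs actually available in each stratum */
  data _sizes;
    retain sampleLeft (&total);
    set _strata nobs=nStrata;
    SampleSize = min(real, round(sampleLeft/(nStrata-_n_+1)));
    sampleLeft + (-SampleSize);
    drop sampleLeft;
  run;

  /* Same stratum order in the data and in the sizes table */
  proc sort data=&data out=_sorted; by &strata; run;
  proc sort data=_sizes; by &strata; run;

  proc surveyselect data=_sorted out=&out seed=&seed selectall sampsize=_sizes;
    strata &strata;
  run;

%mend stratSample;

/* Example call using the data set and variable names from the question */
%stratSample(data=filein, strata=var1, total=60, out=fileout);
```

Each call recomputes the stratum counts, so the sizes never have to be checked by hand. If the whole data set has fewer obs than the requested total, the sample will simply be smaller than the total.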

PG

Message was edited by: PG Added comments.



All Replies
Super User
Posts: 19,768

Re: Advanced problem for Strata Sampling issue

Posted in reply to wtien196838

I'm not 100% sure what you're after but look at the SAMPSIZE= option for your version of proc surveyselect. You can either specify it in another dataset or list it out. The order needs to match the stratum order.

proc surveyselect data=filein method=srs sampsize=(10 25 25)
                  seed=12345 out=fileout selectall;
  strata var1;
run;

OR

proc surveyselect data=filein method=srs sampsize=(20 20 20)
                  seed=12345 out=fileout selectall;
  strata var1;
run;
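
The two calls above use the list form of SAMPSIZE=. For the data set form, a minimal sketch follows. It assumes var1 is a one-character variable in filein and that filein is already sorted by var1; the sizes table is typed in by hand purely for illustration. PROC SURVEYSELECT reads the stratum sample sizes from a variable named SampleSize (or _NSIZE_), and the strata must appear in the same order as in the input data.

```sas
/* Hand-entered stratum sample sizes (illustrative values from Situation 1) */
data sizes;
  length var1 $ 1;           /* assumed to match the type/length of var1 in filein */
  input var1 $ SampleSize;
  datalines;
A 10
B 25
C 25
;
run;

/* SAMPSIZE= now names the data set instead of listing the sizes */
proc surveyselect data=filein method=srs sampsize=sizes
                  seed=12345 out=fileout selectall;
  strata var1;
run;
```

The data set form is the more convenient one when the sizes are computed by an earlier step rather than typed in.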

Contributor
Posts: 59

Re: Advanced problem for Strata Sampling issue

Posted in reply to wtien196838

Let me explain my sampling method.

If all the strata (A, B, C) have more obs than the per-stratum sample size of 20, then it is easy to use proc surveyselect with sampsize=(20 20 20).

If A has only 10 obs, then the missing 10 obs are distributed evenly between B and C, so the sample sizes should be (10 25 25).

If A has 10 obs and B has 15 obs, then the remainder of the sample comes from C, so the sample sizes should be (10 15 35).

I need to draw 75 samples (60 obs in each sample) in a short period of time, so I do not have time to check the number of obs in each stratum by hand.

I need to use SAS code to do this task.

I hope this explanation clarifies my question.

Thanks for your suggestion.

William

Super User
Posts: 19,768

Re: Advanced problem for Strata Sampling issue

Posted in reply to wtien196838

Is the required sample size always 60? Does @pgstats's solution work?

Respected Advisor
Posts: 4,919

Re: Advanced problem for Strata Sampling issue

Posted in reply to wtien196838

I added comments to my code to explain how it implements your sampling method, exactly, for the sashelp.class dataset stratified by age with a total sample size of 12. - PG
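
As a quick check against the numbers in the question, here is a sketch that feeds the Situation 2 stratum counts into the same size-calculation data step. The strata table is typed in by hand for illustration only; in practice it would come from the proc sql step.

```sas
/* Stratum counts from Situation 2 of the question, entered by hand */
data strata;
  input var1 $ real;
  datalines;
A 10
B 15
C 50
;
run;

/* The allocation step expects the smallest stratum first */
proc sort data=strata; by real; run;

%let totalSample=60;

data sizes;
  retain sampleLeft (&totalSample);
  set strata nobs=nStrata;
  /* Equal share of what is left, capped at the stratum size */
  SampleSize = min(real, round(sampleLeft/(nStrata-_n_+1)));
  sampleLeft + (-SampleSize);
  drop sampleLeft;
run;

proc print data=sizes noobs; run;
```

Working through it by hand: A gets min(10, 60/3) = 10, leaving 50; B gets min(15, 50/2) = 15, leaving 35; C gets min(50, 35) = 35. That is the wanted (10 15 35). With B set to 30 instead, the same steps give (10 25 25), as in Situation 1.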

PG
Contributor
Posts: 59

Re: Advanced problem for Strata Sampling issue

Posted in reply to wtien196838

It works. Thanks so much, PGStats. Your code is genius. It took me some time to understand it and apply it. This is the code I want.

Thanks again to all contributors.

Regards,

William
