SAS Programming

DATA Step, Macro, Functions and more
twildone
Pyrite | Level 9

Hi, I am trying to select 25 clients, with one record for each client. Each client has more than one record in the data set. When I run the code below, I end up with only 23 clients, but when I change the sample size from 25 to 27, I end up with 25 clients, which is what I wanted. Any suggestions on how to correct this? Thanks.

 

proc surveyselect data = SUMMARY94 method = URS rep = 1 sampsize = 25 seed = 12345 out = hsbs3;
     id _ALL_;
     SAMPLINGUNIT CLIENT_ID;
run;

DATA hsbs2;
     SET hsbs3;
     do sampleUnitID=1 to numberhits;
          output;
     end;
run;

PROC SORT DATA=hsbs2;
     BY REPLICATE CLIENT_ID SAMPLEUNITID;
RUN;

DATA hsbs1;
     SET hsbs2;
     BY REPLICATE CLIENT_ID SAMPLEUNITID;
     IF FIRST.REPLICATE THEN SAMPLEID=0;
     IF FIRST.SAMPLEUNITID THEN SAMPLEID+1;
RUN;

Accepted Solutions
FreelanceReinh
Jade | Level 19

Hi @twildone,

 

The number of clients is smaller than the specified sample size because you used unrestricted random sampling (METHOD=URS), which samples with replacement: the same client can be drawn more than once, so 25 draws can yield fewer than 25 distinct clients.
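
One way to see the repeated draws directly is sketched below. It is not part of the original post, and it assumes that hsbs3 is the OUT= data set from the METHOD=URS step in the question and that NumberHits holds the number of times each client was drawn, repeated on every record of that client.

```sas
/* Sketch under the assumptions stated above: compare the number of   */
/* distinct clients with the total number of draws in the URS output. */
proc sql;
  select count(*)  as distinct_clients,  /* one row per client in the inner query */
         sum(hits) as total_draws        /* hits summed over clients              */
  from (select client_id,
               max(NumberHits) as hits   /* NumberHits assumed constant within a client */
        from hsbs3
        group by client_id);
quit;
```

If the explanation holds, distinct_clients would come out below 25 while total_draws equals the requested 25.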

 

I think you should follow a two-stage approach:

/* Stage 1: Select all records of 25 randomly selected clients (simple random sampling of clients) */

proc surveyselect data=summary94 n=25 seed=31415 out=stage1;
samplingunit client_id;
run;

/* Stage 2: Select one record per client (simple random sampling of records, stratified by client) */

proc surveyselect data=stage1 n=1 seed=27182 out=hsbs(drop=SelectionProb SamplingWeight);
strata client_id;
run;

(Edit: Dropped variables SelectionProb and SamplingWeight from output dataset assuming these are not needed.)
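
A quick sanity check on the result could look like the sketch below. It assumes the data set and variable names used in the two steps above, and that summary94 contains at least 25 distinct clients.

```sas
/* Sketch: the final data set should hold one record for each of 25 clients. */
proc sql;
  select count(*)                  as n_records,
         count(distinct client_id) as n_clients
  from hsbs;
quit;
```

Both counts should be 25 if the two stages behaved as intended. Note also that the STRATA statement expects its input to be sorted by client_id; if summary94 is not already in that order, stage1 would probably need a PROC SORT by client_id before stage 2.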


Replies
stat_sas
Ammonite | Level 13

Try this.

 

proc sort data=SUMMARY94;
by CLIENT_ID;
run;

proc surveyselect data = SUMMARY94 method = URS sampsize = 1 seed = 12345 out = hsbs3;
strata CLIENT_ID;
run;

twildone
Pyrite | Level 9

Thanks, it worked perfectly!
