Renan_Crepaldi
Obsidian | Level 7

Hi, everyone!

 

I want to set up PROC SURVEYSELECT so that the results are based on filters applied to the source table.

 

I've attached a sample of my data. The rules are:

 

  • Sample of 10%, stratified by idAction
  • The 10% for each idAction has to be computed from the number of rows with control = 'N'
  • The random output has to contain only rows with control = 'S'

Is there a way to do this within the procedure? Or should I use another one?

 

Thanks!

Reeza
Super User
I think you can do it piecemeal, sort of.
Do a manual freq to get your sample size and pass that to SURVEYSELECT. It can take a data set, so you filter the data set for PROC FREQ down to the N rows, and filter the survey selection data set with a WHERE clause to keep only the S rows.

I think that would work...
ballardw
Super User

So people have an idea what the "data" looks like:

idAction;control;idClient
28004;N;40045
28004;N;40311
28004;N;40404
28004;N;35386
28004;N;162426
28004;N;163213
28004;S;149327
28004;S;163481
28004;N;163645
28004;N;149653
28004;N;157771
28004;N;303829
28004;N;304119
28004;S;290727
28004;S;286589
28004;N;304922
28004;S;286922
28004;N;292085
28004;S;450891
28004;S;506107

Now,

What does this mean?

The 10% of each idAction have to be defined by the number of rows with control = 'N'
The random output have to be only rows with control = 'S'

If the output consists only of records where control = 'S', then I do not understand how "the 10% of each idAction have to be defined by the number of rows with control = 'N'".

Please describe in much more detail how the control = 'N' records are actually used. LOTS more detail.

Renan_Crepaldi
Obsidian | Level 7

Hi, @ballardw! Thanks for replying!

 

So I think you've got the idea, but to clarify more, here are some more details:

 

The clients marked with control = N are the targeted ones, and the sample size must be based on their total.

 

On the other hand, the clients with control = S are my control group, whose size needs to be 10% of the targeted group.

 

That's why I need to "cross" these proportions.
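
To make the rule concrete with the sample data posted above: idAction 28004 has 13 rows with control = 'N' and 7 rows with control = 'S', so the control sample should contain ceil(0.1 * 13) = 2 of the 7 'S' rows. A minimal sketch of that calculation, assuming the sample has been read into a data set called have with variables idAction and control (the data set name and the output column names are only illustrative):

```sas
/* Sketch only: HAVE and the column aliases are assumed names. */
/* In PROC SQL a comparison evaluates to 1 or 0, so SUM() counts matching rows. */
proc sql;
    select idAction,
           sum(control = 'N')             as n_targeted,
           sum(control = 'S')             as n_control_available,
           ceil(0.1 * sum(control = 'N')) as n_control_needed
    from have
    group by idAction;
quit;
```

Comparing n_control_needed with n_control_available per idAction also shows up front whether there are enough 'S' rows to draw from.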

 

I don't know if I've made myself clear (English isn't my native language), but I'm happy to give any more information you need.

 

Thanks again!

Reeza
Super User
data have;
infile cards dlm=',';
input idAction $ control $ idClient $ ;
datalines;
28004,N,40045
28004,N,40311
28004,N,40404
28004,N,35386
28004,N,162426
28004,N,163213
28004,S,149327
28004,S,163481
28004,N,163645
28004,N,149653
28004,N,157771
28004,N,303829
28004,N,304119
28004,S,290727
28004,S,286589
28004,N,304922
28004,S,286922
28004,N,292085
28004,S,450891
28004,S,506107
;;;;
run;

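/* Count the targeted (control = 'N') rows in each idAction */
proc freq data=have;
    where control = 'N';
    table idAction / out=counts;
run;

/* Control-group size per idAction: 10% of the targeted count, rounded up */
data sampSizeSpecifications;
    set counts;
    _nsize_ = ceil(0.1*Count);
run;

/* Draw a simple random sample of that size from the control = 'S' rows only */
proc surveyselect data=have method=srs sampsize=sampSizeSpecifications out=selected;
    where control = 'S';
    strata idAction;
run;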

Are you sure you don't want PROC PSMATCH and case control matching?
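
With more than one idAction in the real data, a few extra precautions may be worth adding. The following is only a sketch under assumptions, not a tested drop-in: it sorts the input so the strata appear in the same order as in the sample-size data set, fixes a seed (12345 is an arbitrary value) so the draw can be reproduced, and uses SELECTALL so that a stratum with fewer 'S' rows than requested is taken whole instead of stopping with an error. The data set name have_sorted is made up for this example. It does not handle an idAction that has 'N' rows but no 'S' rows (or the reverse); those would need to be reconciled before the call.

```sas
/* Sort by the strata variable; PROC SURVEYSELECT expects sorted input for STRATA */
proc sort data=have out=have_sorted;
    by idAction;
run;

/* Count targeted rows per idAction without printing the table */
proc freq data=have_sorted noprint;
    where control = 'N';
    tables idAction / out=counts(keep=idAction count);
run;

/* 10% of the targeted count, rounded up, in the _NSIZE_ variable */
data sampSizeSpecifications;
    set counts;
    _nsize_ = ceil(0.1 * count);
    keep idAction _nsize_;
run;

/* Sample only from the control = 'S' rows */
proc surveyselect data=have_sorted(where=(control = 'S'))
                  method=srs
                  sampsize=sampSizeSpecifications
                  seed=12345   /* arbitrary seed, for reproducibility */
                  selectall    /* take every unit if the stratum is too small */
                  out=selected;
    strata idAction;
run;
```

For the sample data above, this should select 2 of the 7 control = 'S' rows for idAction 28004.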

Renan_Crepaldi
Obsidian | Level 7

Thank you very much, @Reeza!

 

About your suggestion to use PROC PSMATCH: I'll use another method to pair the groups afterwards, so I won't need it right now.

 

Nonetheless, thanks for the advice.

 

Regards, Renan.
