Hi, everyone!
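I want to change how I use PROC SURVEYSELECT so that the sample is driven by filters on the source table.
I've attached a sample of my data. The rules are: the 10% of each idAction has to be defined by the number of rows with control = 'N', and the random output has to contain only rows with control = 'S'.
Is there a way to do this within the procedure, or should I use another one?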
Thanks!
So people have an idea what the "data" looks like:
idAction;control;idClient
28004;N;40045
28004;N;40311
28004;N;40404
28004;N;35386
28004;N;162426
28004;N;163213
28004;S;149327
28004;S;163481
28004;N;163645
28004;N;149653
28004;N;157771
28004;N;303829
28004;N;304119
28004;S;290727
28004;S;286589
28004;N;304922
28004;S;286922
28004;N;292085
28004;S;450891
28004;S;506107
Now, what does this mean?
"The 10% of each idAction has to be defined by the number of rows with control = 'N'. The random output has to be only rows with control = 'S'."
If the output consists only of records where control = 'S', then I do not understand how "the 10% of each idAction has to be defined by the number of rows with control = 'N'".
Please describe in much more detail how the control = 'N' records are actually used. LOTS more detail.
Hi, @ballardw! Thanks for replying!
I think you've got the idea, but to clarify, here are some more details:
The clients marked with control = N are the targeted ones, and the sample size must be based on their total count.
The clients marked with control = S are my control group, which needs to be 10% of the size of the targeted group.
That's why I need to "cross" these proportions.
I don't know if I've made myself clear (English isn't my native language), but I'm happy to give any more information.
Thanks again!
data have;
infile cards dlm=',';
input idAction $ control $ idClient $ ;
datalines;
28004,N,40045
28004,N,40311
28004,N,40404
28004,N,35386
28004,N,162426
28004,N,163213
28004,S,149327
28004,S,163481
28004,N,163645
28004,N,149653
28004,N,157771
28004,N,303829
28004,N,304119
28004,S,290727
28004,S,286589
28004,N,304922
28004,S,286922
28004,N,292085
28004,S,450891
28004,S,506107
;;;;
run;
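/* Count the targeted (control = 'N') rows for each idAction */
proc freq data=have;
where control = 'N';
table idAction / out=counts;
run;
/* Required sample size per idAction: 10% of the targeted count, rounded up */
data sampSizeSpecifications;
set counts;
_nsize_ = ceil(0.1*Count);
run;
/* Draw that many rows at random, from the control = 'S' rows only */
proc surveyselect data=have method = srs sampsize = sampSizeSpecifications out=selected;
where control = 'S';
strata idAction;
run;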
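For the sample data there are 13 rows with control = 'N' and 7 with control = 'S', so _nsize_ = ceil(0.1*13) = 2 and two of the seven 'S' rows are drawn. As a rough sketch (not part of the steps above), the draw can be made repeatable and then checked against the requested size. It assumes the data sets have and sampSizeSpecifications already exist in the same session; the seed value 12345 and the column aliases requested and drawn are arbitrary choices made for illustration.

```sas
/* Same draw as above, with a fixed seed so reruns return the same rows. */
/* The seed value is an arbitrary, illustrative choice.                  */
proc surveyselect data=have method=srs seed=12345
                  sampsize=sampSizeSpecifications out=selected;
  where control = 'S';
  strata idAction;
run;

/* Compare the requested size with the number of rows actually drawn, per idAction. */
proc sql;
  select s.idAction,
         s._nsize_           as requested,
         count(sel.idClient) as drawn
  from sampSizeSpecifications as s
       left join selected as sel
       on s.idAction = sel.idAction
  group by s.idAction, s._nsize_;
quit;
```

If an idAction has fewer control = 'S' rows than its _nsize_, the simple random draw cannot be satisfied as requested, so it is worth checking the counts before relying on the output.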
Are you sure you don't want PROC PSMATCH and case control matching?
Thank you very much, @Reeza!
About your suggestion of using PROC PSMATCH: I'll use a method to pair the groups afterwards, so I won't need it right now.
Nonetheless, thanks for the advice.
Regards, Renan.