Programming the statistical procedures from SAS

Creating a conditional one to one random control to match a test group

Reply
Occasional Contributor
Posts: 10

Creating a conditional one to one random control to match a test group

I have a client test group that differs based on store location and client spend group ( high, medium low).

 

I have a larger group of similar control clients created based on past history.  I would like to select a random control client for each test client that shares the same store location and spend level.

 

example of tables:

 

test:  

 

client id store_number spend_level
1 30 1
2 5 1
3 10 2
4 8 2
5 8 3

 

 

I have a potential control group that looks like this but with many more clients, and not weighted the same by store and spend.

How do I create a control sample that has the same number of clients by store and spend.

 

So if Store 3 has 10 low spend, 20 high spend and 3 medium spend clients, and store 5 has 2 low spend, 1 high spend, and 8 medium spend clients, the control will have the same.

I'd like this sample to be random within these conditions. 

 

thank you in advance!

Meredith

 

 

 

SAS Employee
Posts: 51

Re: Creating a conditional one to one random control to match a test group

You can do that with PROC SURVEYSELECT by specifying a secondary input data set (SAMPSIZE=).

 

PROC SURVEYSELECT data=input_SAS_dataset_ToSampleFrom /* control clients */

     SAMPSIZE=Secondary_Input_Data_Set

     out=SampleSRSwithinStrata

     method=srs ; /* srs = Simple Random Sampling */

strata store_number spend_level;

run;

 

You can provide stratum sample sizes in the _NSIZE_ (or SampleSize) variable in the SAMPSIZE= data set.

The secondary input data set must contain all the STRATA variables, with the same type and length as in the DATA= data set. The STRATA groups should appear in the same order in the secondary data set as in the DATA= data set.

 

SAS/STAT(R) 14.1 User's Guide
The SURVEYSELECT Procedure
secondary input data set
http://support.sas.com/documentation/cdl/en/statug/68162/HTML/default/viewer.htm#statug_surveyselect...

 

You can get the required stratum sample sizes with a PROC FREQ on your client test group.

PROC FREQ; tables store_number * spend_level / out=count; run;

 

Good luck with this!

 

SAS Employee
Posts: 51

Re: Creating a conditional one to one random control to match a test group

I forgot to say that, for the SURVEYSELECT solution, you should put a SEED= value that is a (arbitrary) strictly positive integer (>0). Otherwise your results are not reproducible. 

 

From the online doc ...

SEED=number
specifies the initial seed for random number generation. The SEED= value must be a positive integer. If you do not specify the SEED= option, or if the SEED= value is negative or 0, PROC SURVEYSELECT uses the time of day from the computer’s clock to obtain the initial seed.

 

Koen 

Respected Advisor
Posts: 4,751

Re: Creating a conditional one to one random control to match a test group

You can also do your own randomisation and gain greater control

 

proc sql;
create table numbers as
select store_number, spend_level, count(*) as n
from test
group by store_number, spend_level;
quit;

data rndControls;
call streaminit(78576);
set controls;
rnd = rand("UNIFORM");
run;

proc sort data=rndControls; by store_number spend_level rnd; run;

data matchControls;
merge numbers rndControls (in=inControls); by store_number spend_level;
if first.spend_level then count = 0;
if count <= n then do;
    count + inControls;
    if last.spend_level and count < n 
        then missing = n - count;
    output;
    end;
keep store_number spend_level clientId missing;
run;

proc print data=matchControls noobs; run;

 

the idea is to put the controls in random order within store_number and spend_level groups and then pick the first control clients that are required to match the test groups. 

PG
Ask a Question
Discussion stats
  • 3 replies
  • 297 views
  • 0 likes
  • 3 in conversation