BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
juniore
Calcite | Level 5

Hi all!

 

I have a data set with 2million obs and I want to make 7 groups out of it (6 groups with 300,000 obs and one control group with 200,000 obs). It has to be a random selection, however it has to follow this rule: 

 

There is 'country' input in the data and the random groups must have the same country percentages as the big data itself, i.e. if United States is 40% of the data, then all of these subgroups must have 40% United States in the country column. 

 

Any idea is highly appreciated!

 

Thanks 🙂

1 ACCEPTED SOLUTION

Accepted Solutions
PeterClemmensen
Tourmaline | Level 20

Hi and welcome to the SAS Communities 🙂

 

Here is one way

 

/* Example Data */
data have;
   array c{3} $20 _temporary_ ("United States", "Canada", "Italy");
   do ID=1 to 2e6;
      Country=c[ceil(rand("Uniform")*3)];
      var1=rand('integer', 1, 100);
      var2=rand('integer', 1, 100);
      output;
   end;
run;

proc sort data=have;
   by Country;
run;

proc surveyselect data=have seed=125 out=one n=200000 outall noprint;
     strata Country / alloc=prop;
run;

data control;
   set one;
   where Selected=1;
   GroupID=0;
   keep ID Country var1 var2 GroupID;
run;

proc surveyselect data=one(where=(Selected=0)) seed=123 out=two noprint
     groups=(3e5, 3e5, 3e5, 3e5, 3e5, 3e5);
run;

data want;
   set control two(keep=ID Country var1 var2 GroupID);
run;

proc sort data=want;
   by GroupID;
run;

View solution in original post

6 REPLIES 6
yabwon
Amethyst | Level 16

Maybe rand("table")?

 

data have;
  array ccc[0:5] $ 20 _TEMPORARY_ ("Poland", "Australia", "UK", "Japan", "Mexico", "Canada");

  do I = 1 to 2e6;
    country = ccc[mod(I, 6)];
    output;
  end;
run;

proc sort 
  data = have 
  out = tmp
;
  by country;
run;

data want;
  set tmp;
  call streaminit(123);
  GRP = RAND('TABLE', 3/20,3/20,3/20,3/20,3/20,3/20,2/20);
run;

proc freq data = want;
table country * GRP;
run;

 

all the best

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



PeterClemmensen
Tourmaline | Level 20

Hi and welcome to the SAS Communities 🙂

 

Here is one way

 

/* Example Data */
data have;
   array c{3} $20 _temporary_ ("United States", "Canada", "Italy");
   do ID=1 to 2e6;
      Country=c[ceil(rand("Uniform")*3)];
      var1=rand('integer', 1, 100);
      var2=rand('integer', 1, 100);
      output;
   end;
run;

proc sort data=have;
   by Country;
run;

proc surveyselect data=have seed=125 out=one n=200000 outall noprint;
     strata Country / alloc=prop;
run;

data control;
   set one;
   where Selected=1;
   GroupID=0;
   keep ID Country var1 var2 GroupID;
run;

proc surveyselect data=one(where=(Selected=0)) seed=123 out=two noprint
     groups=(3e5, 3e5, 3e5, 3e5, 3e5, 3e5);
run;

data want;
   set control two(keep=ID Country var1 var2 GroupID);
run;

proc sort data=want;
   by GroupID;
run;
juniore
Calcite | Level 5

this worked perfectly, thanks a lot!

PeterClemmensen
Tourmaline | Level 20

Anytime, glad to help 🙂

ballardw
Super User

I would start with something like:

proc surveyselect data=have out=want groups=(300000, 300000, 300000, 300000, 300000, 300000, 200000);
run;

 

 

Reeza
Super User

1. Use PROC SURVEYSELECT

2. Since you want to match by proportion of country, make your strata country.

 

The documentation has a fully worked example of proportional allocation here:

https://documentation.sas.com/?docsetId=statug&docsetVersion=14.3&docsetTarget=statug_surveyselect_e...

 

 

 

 

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

Creating Custom Steps in SAS Studio

Check out this tutorial series to learn how to build your own steps in SAS Studio.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 2250 views
  • 3 likes
  • 5 in conversation