Hello,
I have a team of 29 employees split into 5 groups (both numbers could change); see table HAVE. Some groups have 5 employees and some have 6.
What I would like to obtain: divide each group into two subgroups with an equal number of employees whenever possible.
For groups with 5 employees, the subgroup numbers should be 1,1,1,2,2.
For groups with 6 employees, the subgroup numbers should be 1,1,1,2,2,2.
Many thanks in advance for your help.
Data have ;
infile datalines ;
input name :$12. group :$1. ;
datalines ;
Andrew 1
Barbara 1
Boris 1
Bryan 1
Carla 1
Christophe 2
David 2
Emma 2
Eric 2
Fred 2
Henry 2
Hugo 3
Jamal 3
Jeff 3
Kylian 3
Lionel 3
Maelle 3
Martin 4
Michael 4
Nadia 4
Nasser 4
Paul 4
Robert 4
Sacha 5
Salah 5
Samuel 5
Sophie 5
Stefan 5
Zakia 5
;
run ;
This creates a random subgroup:
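Data have ;
infile datalines ;
/* :$12. reads names longer than the default 8 characters (e.g. Christophe) without truncation */
input name :$12. group :$1. ;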
datalines ;
Andrew 1
Barbara 1
Boris 1
Bryan 1
Carla 1
Christophe 2
David 2
Emma 2
Eric 2
Fred 2
Henry 2
Hugo 3
Jamal 3
Jeff 3
Kylian 3
Lionel 3
Maelle 3
Martin 4
Michael 4
Nadia 4
Nasser 4
Paul 4
Robert 4
Sacha 5
Salah 5
Samuel 5
Sophie 5
Stefan 5
Zakia 5
;
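/* The STRATA statement requires input sorted by the strata variable */
proc sort data=have;
by group;
run;
/* GROUPS=2 randomly splits each stratum into two subgroups; */
/* the automatic variable GroupID is renamed to subgroup on output */
proc surveyselect data=have
out=want(rename=(groupid=subgroup))
groups=2;
strata group;
run;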
The proc sort is needed if your data is not already sorted by group, so that group can be used as the Strata variable.
The Groups= option randomly assigns each record within a stratum to one of the indicated number of groups. The rename on the out= option changes the name of the automatic variable Groupid created by the Groups= option. Subgroup will be numeric in the output. If you need a character variable, use the Want data set as the input to a data step that creates the desired character variable.
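Because the assignment is random, each run can give a different split. The sketch below adds the SEED= option so that reruns are repeatable, then builds the character version in a data step as described above. The seed value 12345 and the names want_char and subgroup_c are assumptions chosen for illustration only.

```sas
/* Sketch: SEED= fixes the random number stream so reruns give the same split. */
/* The seed value 12345 is an arbitrary choice. */
proc surveyselect data=have
  out=want(rename=(groupid=subgroup))
  groups=2
  seed=12345;
  strata group;
run;

/* Character copy of the numeric subgroup. */
/* want_char and subgroup_c are hypothetical names. */
data want_char;
  set want;
  length subgroup_c $1;
  subgroup_c = put(subgroup, 1.);
run;
```

A quick cross-tabulation of group by subgroup (for example with proc freq) is one way to check the subgroup sizes in each group.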
Suggestion: Group is an option name in multiple SAS procedures. You may want to use a variable name other than group to avoid possible confusion between your variable and the option.
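If the fixed pattern from the question (1,1,1,2,2 in listed order) is wanted rather than a random split, a data step can assign it directly. This is a different technique from the Surveyselect approach above, offered as a minimal sketch: it assumes HAVE is already sorted by group, and the names want_fixed, n and i are hypothetical.

```sas
/* Sketch only: assumes HAVE is sorted by group. */
/* want_fixed, n and i are hypothetical names.   */
data want_fixed;
  /* First pass: count the rows in the current group */
  do n = 1 by 1 until (last.group);
    set have;
    by group;
  end;
  /* Second pass: re-read the same rows; the first ceil(n/2) rows get subgroup 1 */
  do i = 1 to n;
    set have;
    subgroup = ifn(i <= ceil(n / 2), 1, 2);
    output;
  end;
  drop n i;
run;
```

With 5 rows in a group, ceil(5/2) is 3, so the first three rows get subgroup 1 and the last two get subgroup 2; with 6 rows the split is three and three.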