Hello,
I have a team of 29 employees in 5 groups (both numbers could change); see table HAVE. Some groups have 5 employees and some have 6.
What I would like to obtain: divide each group into two subgroups with an equal number of employees whenever possible.
For groups with 5 employees, the subgroup numbers should be 1,1,1,2,2.
For groups with 6 employees, the subgroup numbers should be 1,1,1,2,2,2.
Many thanks in advance for your help.
data have;
  infile datalines;
  input name $ group $1. subgroup $1.;
  datalines;
Andrew 1
Barbara 1
Boris 1
Bryan 1
Carla 1
Christophe 2
David 2
Emma 2
Eric 2
Fred 2
Henry 2
Hugo 3
Jamal 3
Jeff 3
Kylian 3
Lionel 3
Maelle 3
Martin 4
Michael 4
Nadia 4
Nasser 4
Paul 4
Robert 4
Sacha 5
Salah 5
Samuel 5
Sophie 5
Stefan 5
Zakia 5
;
run;
Reply to @Nasser_DRMCP:
This creates a random subgroup:
data have;
  infile datalines;
  input name $ group $1.;
  datalines;
Andrew 1
Barbara 1
Boris 1
Bryan 1
Carla 1
Christophe 2
David 2
Emma 2
Eric 2
Fred 2
Henry 2
Hugo 3
Jamal 3
Jeff 3
Kylian 3
Lionel 3
Maelle 3
Martin 4
Michael 4
Nadia 4
Nasser 4
Paul 4
Robert 4
Sacha 5
Salah 5
Samuel 5
Sophie 5
Stefan 5
Zakia 5
;

proc sort data=have;
  by group;
run;

proc surveyselect data=have out=want(rename=(groupid=subgroup)) groups=2;
  strata group;
run;
The PROC SORT step is needed only if your data are not already sorted by group, because the STRATA statement requires input sorted by the strata variable.
The GROUPS= option randomly assigns each record to one of the specified number of groups (here 2) within each stratum. The RENAME= data set option on OUT= renames GroupID, the variable that GROUPS= creates automatically, to subgroup. Subgroup will be numeric in the output. If you need a character variable, use the WANT data set as input to a DATA step that creates it.
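The character conversion described above could look roughly like the following. This is a minimal sketch: the data set name want_char and the temporary variable subgroup_num are made up for illustration, and it assumes WANT is the PROC SURVEYSELECT output with a numeric subgroup holding 1 or 2.

```sas
/* Sketch only: WANT_CHAR and SUBGROUP_NUM are illustrative names. */
data want_char;
  length subgroup $ 1;                      /* new character variable */
  set want(rename=(subgroup=subgroup_num)); /* move the numeric value aside */
  subgroup = put(subgroup_num, 1.);         /* 1 -> '1', 2 -> '2' */
  drop subgroup_num;
run;
```

The rename on the SET statement is there because a variable cannot change type in place, so the numeric version has to be read under another name before the character subgroup is assigned.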
Suggestion: GROUP is an option name in several SAS procedures. You may want to use a variable name other than group to avoid confusion between your variable and the option.
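If the assignment should follow the row order exactly as in the question (1,1,1,2,2 for five employees and 1,1,1,2,2,2 for six) rather than be random, a plain DATA step is one possible alternative. This is only a sketch, not the approach from the reply above: the names counts, want_fixed, n and seq are invented here, and it assumes HAVE contains name and group with no existing subgroup variable.

```sas
/* Sketch only: COUNTS, WANT_FIXED, N and SEQ are illustrative names. */
proc sort data=have;
  by group;
run;

/* Number of employees in each group */
proc sql;
  create table counts as
  select group, count(*) as n
  from have
  group by group
  order by group;
quit;

data want_fixed;
  length subgroup $ 1;
  merge have counts;
  by group;
  if first.group then seq = 0;  /* restart the counter for each group */
  seq + 1;                      /* sum statement: value is retained across rows */
  /* the first ceil(n/2) employees get '1', the rest get '2' */
  subgroup = ifc(seq <= ceil(n / 2), '1', '2');
  drop seq n;
run;
```

With an odd group size the extra employee lands in subgroup 1, which matches the 1,1,1,2,2 pattern requested. Running PROC FREQ with `tables group*subgroup;` on the result is a quick way to check the subgroup sizes.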