Hello folks,
I am trying to assign each observation to one of 3 groups based on two variables (N size and score) so that the groups have similar values for both variables (i.e., the average N size and average score per group are similar). A small mock-up data set is shown below (my real data is longer). How should I write code for the group assignment?
data1
id | N size | score |
A | 10 | 131 |
B | 30 | 127 |
C | 4 | 109 |
D | 1 | 110 |
E | 9 | 125 |
F | 51 | 127 |
G | 22 | 119 |
H | 42 | 130 |
I | 7 | 100 |
I took this code from @Rick_SAS, but the best choice is SAS/OR.
data Units;
   infile cards expandtabs;
   input id $ size score;
cards;
A 10 131
B 30 127
C 4 109
D 1 110
E 9 125
F 51 127
G 22 119
H 42 130
I 7 100
;

%let NumGroups = 3;           /* number of treatment groups */
data Treatments;
   do Trt = 1 to &NumGroups;  /* Trt is variable that assigns patients to groups */
      output;
   end;
run;

%let Var = size score;        /* names of multiple covariates */
proc optex data=Treatments seed=97531 coding=orthcan;
   class Trt;
   model Trt;                 /* specify treatment model */
   blocks design=Units;       /* specify units */
   model &Var;                /* multiple covariate means will be approx same */
   output out=Groups;         /* merged data: units assigned to groups */
run;

proc means data=Groups mean std;
   class Trt;
   var &Var;
run;
PROC FASTCLUS or PROC CLUSTER with method = AVERAGE
Neither method seems to give me similar average values for each variable. Is there a method to assign each observation to one of 5 groups so that each group has a similar average of v1 and v2?
Not sure exactly what you mean by "similar". Can you show us (a portion) of the clusters that result?
I think it is more like an OR problem.
Post it in the OR forum:
https://communities.sas.com/t5/Mathematical-Optimization/bd-p/operations_research
and call out @RobPratt.
Here is @Rick_SAS's blog post about it; maybe it can help.
https://blogs.sas.com/content/iml/2017/05/01/split-data-groups-mean-variance.html
Please see my answer to this earlier question for an approach that uses PROC OPTMODEL.
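For readers without the linked answer at hand, here is a minimal sketch of how such a PROC OPTMODEL formulation might look. It is not necessarily the formulation used in that answer. It assumes the Units data set from the PROC OPTEX code in this thread already exists (variables id, size, score) and that SAS/OR is licensed. All names in it (Assign, MinSize, MaxSize, MinScore, MaxScore, GroupsOpt, GroupsOptFull) are made up for illustration. The model balances group totals rather than group means, which amounts to the same thing only when the groups have equal counts, and it simply adds the two spreads without rescaling, so the variable with the larger range weighs more.

```sas
%let NumGroups = 3;   /* same value as in the PROC OPTEX example */

proc optmodel;
   /* Read the units; assumes data set Units with variables id, size, score */
   set <str> IDS;
   num unitSize {IDS}, unitScore {IDS};
   read data Units into IDS=[id] unitSize=size unitScore=score;

   set GROUPS = 1..&NumGroups;

   /* Keep the number of units per group as equal as possible */
   num minCount = floor(card(IDS) / &NumGroups);
   num maxCount = ceil(card(IDS) / &NumGroups);

   /* Assign[i,g] = 1 if unit i is placed in group g */
   var Assign {IDS, GROUPS} binary;

   /* Smallest and largest group totals for each variable */
   var MinSize, MaxSize, MinScore, MaxScore;

   /* Every unit goes to exactly one group */
   con OneGroup {i in IDS}:
      sum {g in GROUPS} Assign[i,g] = 1;

   con CountRange {g in GROUPS}:
      minCount <= sum {i in IDS} Assign[i,g] <= maxCount;

   /* Bracket each group's totals between the Min and Max variables */
   con SizeLo {g in GROUPS}:
      sum {i in IDS} unitSize[i] * Assign[i,g] >= MinSize;
   con SizeHi {g in GROUPS}:
      sum {i in IDS} unitSize[i] * Assign[i,g] <= MaxSize;
   con ScoreLo {g in GROUPS}:
      sum {i in IDS} unitScore[i] * Assign[i,g] >= MinScore;
   con ScoreHi {g in GROUPS}:
      sum {i in IDS} unitScore[i] * Assign[i,g] <= MaxScore;

   /* Minimize the combined spread of the group totals (unscaled) */
   min TotalSpread = (MaxSize - MinSize) + (MaxScore - MinScore);

   solve;

   /* One row per unit with its assigned group */
   create data GroupsOpt from [id Trt]=
      {id in IDS, Trt in GROUPS: Assign[id,Trt].sol > 0.5};
quit;

/* Attach the covariates again and compare the group means */
proc sql;
   create table GroupsOptFull as
   select u.id, u.size, u.score, g.Trt
   from Units as u inner join GroupsOpt as g
   on u.id = g.id;
quit;

proc means data=GroupsOptFull mean std;
   class Trt;
   var size score;
run;
```

If the two variables are on very different scales, dividing each spread by that variable's overall standard deviation before adding them would be one way to weight them more evenly.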