Min Value: |
---|
0.0495539 |
I have 2190 Obs and want to divide this into 5 groups of 438 Obs in each group and have roughly the same amount of SIZE in each group. I know the total for SIZE is 6660.73, so roughly 1332.146 in each group of 438 Obs.
I played around with PROC RANK but that didn't seem like the answer. The part I am having trouble with is the 438 obs in each group. I can do some basic IF/THEN and get 5 groups with roughly 1332.146 SIZE in each group but the number of OBS in those 5 groups is obviously not 438.
Obs ID MOD SIZE
1 DBAS01 3390-3 0.3382
2 DV3L0B 3390-3 2.8372
3 DV3L0C 3390-3 2.8232
4 DV3L0D 3390-3 2.3266
5 DV3L0E 3390-3 2.7004
6 DV3L0F 3390-3 1.3652
7 DV3L0G 3390-3 2.7708
8 DV3L0H 3390-3 2.8374
9 DV3L0I 3390-3 2.3264
10 DV3L0J 3390-3 2.8377
...
...
2185 TSON34 3390-3 0.5344
2186 TSON35 3390-3 0.4412
2187 TSON36 3390-3 0.9780
2188 TSON37 3390-3 1.5679
2189 TSON38 3390-3 0.5232
2190 TSON39 3390-3 0.6562
It is more like a OR problem . better post it in OR forum.
Or would you like Genetic Algorithm to solve this problem ?
But I think SAS/OR is your best choice .
calling @RobPratt
Can you attach a CSV file or post more data ?
I like to use GA to solve this problem.
It is more like a OR problem . better post it in OR forum.
Or would you like Genetic Algorithm to solve this problem ?
But I think SAS/OR is your best choice .
calling @RobPratt
Can you attach a CSV file or post more data ?
I like to use GA to solve this problem.
See the PROC OPTMODEL code from this related thread.
Here is the blog written by @Rick_SAS you might be interesting.
http://blogs.sas.com/content/iml/2017/05/01/split-data-groups-mean-variance.html
Here is a GA example . Enjoy yourself.
data Units;
set Sashelp.Heart(where=(status = "Alive"));
n+1;
keep n Cholesterol;
run;
proc iml;
use Units nobs nobs;
read all var {Cholesterol};
close;
start function(x) global(Cholesterol,group,mean_std);
xx=shape(x,group,0,.);
do i=1 to group;
idx=setdif(xx[i,],{.});
temp=Cholesterol[idx];
mean_std[i,1]=mean(temp);
end;
sse=range(mean_std);
return (sse);
finish;
group=5; /* <--Change it(divide into 5 groups)*/
mean_std=j(group,1,.);
id=gasetup(3,nobs,123456789);
call gasetobj(id,0,"function");
call gasetsel(id,10,1,1);
call gainit(id,1000);
niter = 100;
do i = 1 to niter;
call garegen(id);
call gagetval(value, id);
end;
call gagetmem(mem, value, id, 1);
col_mem=shape(mem,group,0,.);
create group from col_mem;
append from col_mem;
close;
print value[l = "Min Value:"] ;
call gaend(id);
quit;
data group;
set group;
group+1;
run;
proc transpose data=group out=member(drop=_: index=(col1));
by group;
var col:;
run;
data want;
merge member(rename=(col1=n) where=(n is not missing)) Units(keep=n Cholesterol);
by n;
run;
proc means data=want mean ;
class group;
var Cholesterol;
run;
OUTPUT:
Min Value: |
---|
0.0495539 |
The MEANS Procedure
Analysis Variable : Cholesterol | ||
---|---|---|
group | N Obs | Mean |
1 | 644 | 221.9617225 |
2 | 644 | 221.9792663 |
3 | 644 | 221.9556962 |
4 | 644 | 221.9297125 |
5 | 642 | 221.9727127 |
Thanks to everyone for the suggestions. Since this is SAS running on z/OS, I was not able to do most of the suggestions. The proc optex looked like to most promising but that procedures was not available on z/OS.
I ended up just sorting them by size and assigning a group number from 1 to 5. Repeating that thru all the obs in descending size order. When I sorted them by number and then broke them out into 5 groups, they where roughly the same size.
| | Size ALLOC | Size FREE | # OF Items
| |------------+------------+------------
| Size CAP | Sum | Sum | Sum
----------------------------+------------+------------+------------+------------
Group1 | 2136.44| 1347.25| 789.18| 439.00
Group2 | 2117.11| 1321.39| 795.71| 439.00
Group3 | 2136.44| 1350.09| 786.35| 439.00
Group4 | 2161.44| 1372.98| 788.47| 439.00
Group5 | 2022.22| 1269.02| 753.20| 434.00
All | 10573.65| 6660.73| 3912.91| 2190.00
--------------------------------------------------------------------------------
Sometimes you just need some suggestions to get you unstuck from your direction and give you some ideas on where to go.
Thanks for all the suggestions.
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.