Help using Base SAS procedures

Group Obs with equal size based on varaiable and set number of Obs in each group

Accepted Solution Solved
Reply
New Contributor
Posts: 2
Accepted Solution

Group Obs with equal size based on varaiable and set number of Obs in each group

I have 2190 Obs and want to divide this into 5 groups of 438 Obs in each group and have roughly the same amount of SIZE in each group. I know the total for SIZE is 6660.73, so roughly 1332.146 in each group of 438 Obs.

 

I played around with PROC RANK but that didn't seem like the answer. The part I am having trouble with is the 438 obs in each group. I can do some basic IF/THEN and get 5 groups with roughly 1332.146 SIZE in each group but the number of OBS in those 5 groups is obviously not 438.

 

Obs      ID      MOD           SIZE
                                    
  1    DBAS01    3390-3       0.3382
  2    DV3L0B    3390-3       2.8372
  3    DV3L0C    3390-3       2.8232
  4    DV3L0D    3390-3       2.3266
  5    DV3L0E    3390-3       2.7004
  6    DV3L0F    3390-3       1.3652
  7    DV3L0G    3390-3       2.7708
  8    DV3L0H    3390-3       2.8374
  9    DV3L0I    3390-3       2.3264
 10    DV3L0J    3390-3       2.8377

...

...

2185    TSON34    3390-3       0.5344
2186    TSON35    3390-3       0.4412
2187    TSON36    3390-3       0.9780
2188    TSON37    3390-3       1.5679
2189    TSON38    3390-3       0.5232
2190    TSON39    3390-3       0.6562


Accepted Solutions
Solution
Monday
Super User
Posts: 9,681

Re: Group Obs with equal size based on varaiable and set number of Obs in each group

[ Edited ]

It is more like a OR problem . better post it in OR forum.

Or would you like Genetic Algorithm to solve this problem ?

But I think SAS/OR is your best choice .

calling @RobPratt

 

 

Can you attach a CSV file or post more data ?

I like to use GA to solve this problem.

View solution in original post


All Replies
Solution
Monday
Super User
Posts: 9,681

Re: Group Obs with equal size based on varaiable and set number of Obs in each group

[ Edited ]

It is more like a OR problem . better post it in OR forum.

Or would you like Genetic Algorithm to solve this problem ?

But I think SAS/OR is your best choice .

calling @RobPratt

 

 

Can you attach a CSV file or post more data ?

I like to use GA to solve this problem.

SAS Employee
Posts: 433

Re: Group Obs with equal size based on varaiable and set number of Obs in each group

See the PROC OPTMODEL code from this related thread.

Super User
Posts: 9,681

Re: Group Obs with equal size based on varaiable and set number of Obs in each group

Here is the blog written by @Rick_SAS you might be interesting.

 

http://blogs.sas.com/content/iml/2017/05/01/split-data-groups-mean-variance.html

Super User
Posts: 9,681

Re: Group Obs with equal size based on varaiable and set number of Obs in each group

Here is a GA example . Enjoy yourself.

 

data Units;            
set Sashelp.Heart(where=(status = "Alive"));
n+1;
keep n Cholesterol;
run;
 
proc iml;
use Units nobs nobs;
read all var {Cholesterol};
close;
 
start function(x) global(Cholesterol,group,mean_std);
xx=shape(x,group,0,.);
do i=1 to group;
 idx=setdif(xx[i,],{.});
 temp=Cholesterol[idx]; 
 mean_std[i,1]=mean(temp);
end;
sse=range(mean_std);
return (sse);
finish;
 
group=5;  /* <--Change it(divide into 5 groups)*/
mean_std=j(group,1,.);
  
id=gasetup(3,nobs,123456789);
call gasetobj(id,0,"function");
call gasetsel(id,10,1,1);
call gainit(id,1000);
 
niter = 100;
do i = 1 to niter;
 call garegen(id);
 call gagetval(value, id);
end;
call gagetmem(mem, value, id, 1);
 
col_mem=shape(mem,group,0,.);

create group from col_mem;
append from col_mem;
close;
 
print value[l = "Min Value:"] ;
call gaend(id);
quit;

data group;
 set group;
 group+1;
run;
proc transpose data=group out=member(drop=_: index=(col1));
by group;
var col:;
run;
 
data want;
 merge member(rename=(col1=n) where=(n is not missing)) Units(keep=n Cholesterol);
 by n;
run;
proc means data=want mean ;
  class group;
  var Cholesterol;
run;

 

OUTPUT:

Min Value:
0.0495539

The MEANS Procedure

Analysis Variable : Cholesterol
group N Obs Mean
1 644 221.9617225
2 644 221.9792663
3 644 221.9556962
4 644 221.9297125
5 642 221.9727127
New Contributor
Posts: 2

Re: Group Obs with equal size based on varaiable and set number of Obs in each group

Thanks to everyone for the suggestions. Since this is SAS running on z/OS, I was not able to do most of the suggestions. The proc optex looked like to most promising but that procedures was not available on z/OS.

 

I ended up just sorting them by size and assigning a group number from 1 to 5. Repeating that thru all the obs in descending size order. When I sorted them by number and then broke them out into 5 groups, they where roughly the same size.

 

                            |            | Size ALLOC | Size FREE  | # OF Items  
                            |            |------------+------------+------------
                            |   Size CAP |    Sum     |    Sum     |    Sum     
----------------------------+------------+------------+------------+------------
Group1                      |     2136.44|     1347.25|      789.18|      439.00
Group2                      |     2117.11|     1321.39|      795.71|      439.00
Group3                      |     2136.44|     1350.09|      786.35|      439.00
Group4                      |     2161.44|     1372.98|      788.47|      439.00
Group5                      |     2022.22|     1269.02|      753.20|      434.00
All                         |    10573.65|     6660.73|     3912.91|     2190.00
--------------------------------------------------------------------------------

 

Sometimes you just need some suggestions to get you unstuck from your direction and give you some ideas on where to go.

 

Thanks for all the suggestions.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 182 views
  • 6 likes
  • 3 in conversation