Min Value: |
---|

0.0495539 |

turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- SAS Procedures
- /
- Group Obs with equal size based on varaiable and s...

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

a week ago

I have 2190 Obs and want to divide this into 5 groups of 438 Obs in each group and have roughly the same amount of SIZE in each group. I know the total for SIZE is 6660.73, so roughly 1332.146 in each group of 438 Obs.

I played around with PROC RANK but that didn't seem like the answer. The part I am having trouble with is the 438 obs in each group. I can do some basic IF/THEN and get 5 groups with roughly 1332.146 SIZE in each group but the number of OBS in those 5 groups is obviously not 438.

Obs ID MOD SIZE

1 DBAS01 3390-3 0.3382

2 DV3L0B 3390-3 2.8372

3 DV3L0C 3390-3 2.8232

4 DV3L0D 3390-3 2.3266

5 DV3L0E 3390-3 2.7004

6 DV3L0F 3390-3 1.3652

7 DV3L0G 3390-3 2.7708

8 DV3L0H 3390-3 2.8374

9 DV3L0I 3390-3 2.3264

10 DV3L0J 3390-3 2.8377

...

...

2185 TSON34 3390-3 0.5344

2186 TSON35 3390-3 0.4412

2187 TSON36 3390-3 0.9780

2188 TSON37 3390-3 1.5679

2189 TSON38 3390-3 0.5232

2190 TSON39 3390-3 0.6562

Accepted Solutions

Solution

Monday

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

a week ago - last edited a week ago

It is more like a OR problem . better post it in OR forum.

Or would you like Genetic Algorithm to solve this problem ?

But I think SAS/OR is your best choice .

calling @RobPratt

Can you attach a CSV file or post more data ?

I like to use GA to solve this problem.

All Replies

Solution

Monday

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

a week ago - last edited a week ago

It is more like a OR problem . better post it in OR forum.

Or would you like Genetic Algorithm to solve this problem ?

But I think SAS/OR is your best choice .

calling @RobPratt

Can you attach a CSV file or post more data ?

I like to use GA to solve this problem.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

a week ago

See the PROC OPTMODEL code from this related thread.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

a week ago

Here is the blog written by @Rick_SAS you might be interesting.

http://blogs.sas.com/content/iml/2017/05/01/split-data-groups-mean-variance.html

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

a week ago

Here is a GA example . Enjoy yourself.

```
data Units;
set Sashelp.Heart(where=(status = "Alive"));
n+1;
keep n Cholesterol;
run;
proc iml;
use Units nobs nobs;
read all var {Cholesterol};
close;
start function(x) global(Cholesterol,group,mean_std);
xx=shape(x,group,0,.);
do i=1 to group;
idx=setdif(xx[i,],{.});
temp=Cholesterol[idx];
mean_std[i,1]=mean(temp);
end;
sse=range(mean_std);
return (sse);
finish;
group=5; /* <--Change it(divide into 5 groups)*/
mean_std=j(group,1,.);
id=gasetup(3,nobs,123456789);
call gasetobj(id,0,"function");
call gasetsel(id,10,1,1);
call gainit(id,1000);
niter = 100;
do i = 1 to niter;
call garegen(id);
call gagetval(value, id);
end;
call gagetmem(mem, value, id, 1);
col_mem=shape(mem,group,0,.);
create group from col_mem;
append from col_mem;
close;
print value[l = "Min Value:"] ;
call gaend(id);
quit;
data group;
set group;
group+1;
run;
proc transpose data=group out=member(drop=_: index=(col1));
by group;
var col:;
run;
data want;
merge member(rename=(col1=n) where=(n is not missing)) Units(keep=n Cholesterol);
by n;
run;
proc means data=want mean ;
class group;
var Cholesterol;
run;
```

OUTPUT:

Min Value: |
---|

0.0495539 |

The MEANS Procedure

Analysis Variable : Cholesterol | ||
---|---|---|

group | N Obs | Mean |

1 | 644 | 221.9617225 |

2 | 644 | 221.9792663 |

3 | 644 | 221.9556962 |

4 | 644 | 221.9297125 |

5 | 642 | 221.9727127 |

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Monday

Thanks to everyone for the suggestions. Since this is SAS running on z/OS, I was not able to do most of the suggestions. The proc optex looked like to most promising but that procedures was not available on z/OS.

I ended up just sorting them by size and assigning a group number from 1 to 5. Repeating that thru all the obs in descending size order. When I sorted them by number and then broke them out into 5 groups, they where roughly the same size.

| | Size ALLOC | Size FREE | # OF Items

| |------------+------------+------------

| Size CAP | Sum | Sum | Sum

----------------------------+------------+------------+------------+------------

Group1 | 2136.44| 1347.25| 789.18| 439.00

Group2 | 2117.11| 1321.39| 795.71| 439.00

Group3 | 2136.44| 1350.09| 786.35| 439.00

Group4 | 2161.44| 1372.98| 788.47| 439.00

Group5 | 2022.22| 1269.02| 753.20| 434.00

All | 10573.65| 6660.73| 3912.91| 2190.00

--------------------------------------------------------------------------------

Sometimes you just need some suggestions to get you unstuck from your direction and give you some ideas on where to go.

Thanks for all the suggestions.