turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- BI
- /
- Enterprise Guide
- /
- How to split dataset with condition

Topic Options

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

08-02-2016 02:43 AM - edited 08-02-2016 02:44 AM

Hi all,

I have a dataset, including 1000 clients corresponding 1000 loans differently. I'd like to split 1000 clients to 50 groups with condition is total loans amount and the number of clients of each group equally?

Could you help me solve this calculation by SAS EG?

Cust_ID | Loan_amt ($) |

A | 10 |

B | 20 |

C | 15 |

D | 58 |

Thanks alot,

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

08-02-2016 02:50 AM

```
%let groups = 50;
proc sql noprint;
select sum(loan_amt) into :sum_loan from have;
run;
data want;
set have;
retain group group_amt;
if _n_ = 1
then do;
group = 1;
group_amount = 0;
end;
group_amount + loan_amt;
if group_amount > &sum_loan / &groups
then do;
group + 1;
group_amount = 0;
end;
drop group_amt;
run;
```

---------------------------------------------------------------------------------------------

Maxims of Maximally Efficient SAS Programmers

How to convert datasets to data steps

How to post code

Maxims of Maximally Efficient SAS Programmers

How to convert datasets to data steps

How to post code

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to KurtBremser

08-02-2016 03:27 AM

Thank KurtBremser a lot, but I'd like to have the result which must be satisfy 2 conditions as below:

1/ The total client in each group is equally.

2/ The total loan amount of each group is equally.

As your advice, the result is total loan amount of each group will be less than or equal average loan amount.

1/ The total client in each group is equally.

2/ The total loan amount of each group is equally.

As your advice, the result is total loan amount of each group will be less than or equal average loan amount.

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

08-02-2016 03:36 AM

No. You can't do that. You only can say : 1/ The total client in each group is equally. 2/ The total loan amount of each group is ALMOST equally. But that will lead you to a SAS/OR problem. Suggest you post it at OR forum , @Prat is there. Here is my GA code. Maybe you want OR code if you have SAS/OR . proc iml; use sashelp.class; read all var {age} ; close ; start function(x) global(age,group); if ncol(x) ^= ncol(unique(x)) then obj=9999; else do; y=age[x,]; obj=range(shape(y,group,0,0)[,+]); end; return (obj); finish; start switch_mut(s) global(nswitches,nrow); if ncol(s) ^= ncol(unique(s)) then s=ranperm(nrow,1); else do; n = ncol(s); do i = 1 to nswitches; k1 = int(uniform(1234)*n) + 1; k2 = int(uniform(1234)*n) + 1; if k1^=k2 then do; temp=s[k2]; s[k2] = s[k1]; s[k1] = temp; end; end; end; finish; start uniform_cross(child1, child2, parent1, parent2) global(nrow); child1 = parent1; child2 = parent2; do i = 1 to ncol(parent1); r = uniform(1234); if r<=0.5 then do; child1[i] = parent2[i]; child2[i] = parent1[i]; end; end; if ncol(child1) ^= ncol(unique(child1)) then child1=ranperm(nrow,1); else if ncol(child2) ^= ncol(unique(child2)) then child2=ranperm(nrow,1); finish; nrow=nrow(age); group=4; nswitches = 3; encoding=j(2,nrow,1); encoding[2,]=nrow; id=gasetup(2,nrow,1234); call gasetobj(id,0,"function"); call gasetcro(id,0.95,0,"uniform_cross"); call gasetmut(id,0.95,0,"switch_mut"); call gasetsel(id,100,1,1); call gainit(id,1000,encoding); niter = 10000; summary = j(niter,2); mattrib summary [c = {"Min Value", "Avg Value"} l=""]; do i = 1 to niter; call garegen(id); call gagetval(value, id); summary[i,1] = value[1]; summary[i,2] = value[:]; end; call gagetmem(mem, value, id, 1); Memebers=shape(age[mem,],group,0,0); print "Members:" Memebers[l=""], "Group sum:" (Memebers[,+]), "Min Value: " value[l = ""] ; call gaend(id); quit;

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Ksharp

08-02-2016 04:23 AM

Thanks Ksharp alot,

Actually, it's difficult to me understand your code.

Actually, it's difficult to me understand your code.

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

08-02-2016 03:49 AM - edited 08-02-2016 03:52 AM

Well, I overlooked that.

On second thought, the solution could even be easier:

```
proc sort data=have;
by descending loan_amt;
run;
data want;
set have;
retain group 0 direction 1;
if direction
then do;
group + 1;
if group > &groups
then do;
group = &groups;
direction = 0;
end;
end;
else do;
group + (-1);
if group < 1
then do;
group = 1;
direction = 1;
end;
end;
run;
```

This should give you an almost even distribution.

---------------------------------------------------------------------------------------------

Maxims of Maximally Efficient SAS Programmers

How to convert datasets to data steps

How to post code

Maxims of Maximally Efficient SAS Programmers

How to convert datasets to data steps

How to post code

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to KurtBremser

08-02-2016 04:19 AM

Thank you for your supporting, KurtBremer.

With your way, I see that the number clients of each group will be equally,

However the total loan amount of each group will be not almost equally. You know that, the gap (loan amount) of each loans is perhap a bit or so much, so if we sort by loan amount and use directory, it'll be not optimal.

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

08-02-2016 05:54 AM

My code is IML code using Genetic Algorithm , which you need SAS/IML or SAS University Edition to run it. If you have SAS/OR license , Here are some URL you might be interesting in . https://communities.sas.com/t5/Mathematical-Optimization/Minimization-challenge/m-p/259055#U259055 https://communities.sas.com/t5/Mathematical-Optimization/OR-Challenge/m-p/268967#U268967 https://communities.sas.com/t5/Mathematical-Optimization/OR-Challenge-again/m-p/274861#U274861