In a given territory, I have entity, offices that roll up to the entity, and customer IDs for each office (as in table below). Offices in a given entity can share one or more customers, and I want to be able to cycle through different combinations of all offices in a given territory (whether they belong to the same entity or not) and determine a unique count of customers. For example, if I wanted to get the optimal combination of 60% of the offices listed below (3 offices), where optimal is defined as maximizing the number of unique customers, I would have the counts for the following combinations:
A, B, C = 6 unique IDs
A, B, D = 5 unique IDs
A, D, E = 5 unique IDs
...
I need to be able to select a percentage or counts of offices as the territory level and cycle through. Any help is greatly appreciated!!
Territory Entity Office Customer
Atlanta J Inc A 1
Atlanta J Inc A 2
Atlanta J Inc A 3
Atlanta J Inc B 2
Atlanta J Inc B 4
Atlanta J Inc C 2
Atlanta J Inc C 3
Atlanta J Inc C 5
Atlanta J Inc C 8
Atlanta H Inc D 9
Atlanta H Inc E 10
You just need the combination which has three members, right ?
data have;
input Territory $ Entity & $20. Office $ Customer;
cards;
Atlanta J Inc A 1
Atlanta J Inc A 2
Atlanta J Inc A 3
Atlanta J Inc B 2
Atlanta J Inc B 4
Atlanta J Inc C 2
Atlanta J Inc C 3
Atlanta J Inc C 5
Atlanta J Inc C 8
Atlanta H Inc D 9
Atlanta H Inc E 10
;
run;
data want;
if _n_ eq 1 then do;
if 0 then set have;
declare hash ha(dataset:'have',multidata:'y');
ha.definekey('Territory','Office');
ha.definedata('Customer');
ha.definedone();
length list $ 200 key 8;
declare hash c();
c.definekey('key');
c.definedone();
end;
set have;
by Territory;
length list $ 200;
array x{9999} $ 40 _temporary_;
if first.Territory then do;
n=0;call missing(of x{*});c.clear();
end;
if Office not in x then do;n+1;x{n}=Office;end;
if last.Territory then do;
do i=1 to n;
do j=i+1 to n;
do k=j+1 to n;
c.clear();
Office=x{i};
rc=ha.find();
do while(rc=0);
key=Customer;
rx=c.add();
rc=ha.find_next();
end;
Office=x{j};
rc=ha.find();
do while(rc=0);
key=Customer;
rx=c.add();
rc=ha.find_next();
end;
Office=x{k};
rc=ha.find();
do while(rc=0);
key=Customer;
rx=c.add();
rc=ha.find_next();
end;
list=catx(',',x{i},x{j},x{k});count=c.num_items; output;
end;
end;
end;
end;
keep list count;
run;
You just need the combination which has three members, right ?
data have;
input Territory $ Entity & $20. Office $ Customer;
cards;
Atlanta J Inc A 1
Atlanta J Inc A 2
Atlanta J Inc A 3
Atlanta J Inc B 2
Atlanta J Inc B 4
Atlanta J Inc C 2
Atlanta J Inc C 3
Atlanta J Inc C 5
Atlanta J Inc C 8
Atlanta H Inc D 9
Atlanta H Inc E 10
;
run;
data want;
if _n_ eq 1 then do;
if 0 then set have;
declare hash ha(dataset:'have',multidata:'y');
ha.definekey('Territory','Office');
ha.definedata('Customer');
ha.definedone();
length list $ 200 key 8;
declare hash c();
c.definekey('key');
c.definedone();
end;
set have;
by Territory;
length list $ 200;
array x{9999} $ 40 _temporary_;
if first.Territory then do;
n=0;call missing(of x{*});c.clear();
end;
if Office not in x then do;n+1;x{n}=Office;end;
if last.Territory then do;
do i=1 to n;
do j=i+1 to n;
do k=j+1 to n;
c.clear();
Office=x{i};
rc=ha.find();
do while(rc=0);
key=Customer;
rx=c.add();
rc=ha.find_next();
end;
Office=x{j};
rc=ha.find();
do while(rc=0);
key=Customer;
rx=c.add();
rc=ha.find_next();
end;
Office=x{k};
rc=ha.find();
do while(rc=0);
key=Customer;
rx=c.add();
rc=ha.find_next();
end;
list=catx(',',x{i},x{j},x{k});count=c.num_items; output;
end;
end;
end;
end;
keep list count;
run;
This is awesome! Thanks!! Quick question. Is there an easy way to make the number of combinations a variable? Right now, to go from 3 to 4 combinations I have to modify a few parts in the code. Is there a way code it such that I only have to put in the number 3 or 4?
" Is there an easy way to make the number of combinations a variable? "
No, there is not a easy way to go .If you want get the combination of 4 , that is easy to change .
But If you want define a 3 or 4 like a macro variable, I am afraid that I have make a macro .
And If you want all the combination 1 ,2,3,4,........ , I can also make it happen via function GRAYCODE() .
Xia Keshan
I am able to change the combination to 4, 5, ... etc. The problem is I may need to run combinations up to 50 or 200 and it takes a while for me to change the code to do that. That is why I was hoping to get a macro to do it. Any help is greatly appreciated.
Abdou
OK. Let me think about it . Give me some time .
OK. I think I get it all figure out.
data have;
input Territory $ Entity $ Office $ Customer;
cards;
Atlanta JInc A 1
Atlanta JInc A 2
Atlanta JInc A 3
Atlanta JInc B 2
Atlanta JInc B 4
Atlanta JInc C 2
Atlanta JInc C 3
Atlanta JInc C 5
Atlanta JInc C 8
Atlanta HInc D 9
Atlanta HInc E 10
BAtlant JInc A 1
BAtlant JInc A 2
BAtlant JInc A 3
BAtlant JInc B 2
BAtlant JInc B 4
BAtlant JInc C 2
BAtlant JInc C 3
BAtlant JInc C 5
BAtlant JInc C 8
BAtlant HInc D 9
;
run;
proc sql;
select max(n) into : n
from (select count(distinct Office ) as n from have group by Territory );
quit;
%let k=3;
data want;
if _n_ eq 1 then do;
if 0 then set have;
declare hash ha(dataset:'have',multidata:'y');
ha.definekey('Territory','Office');
ha.definedata('Customer');
ha.definedone();
length list $ 2000 key 8;
declare hash c();
c.definekey('key');
c.definedone();
end;
set have;
by Territory;
array x{&n} $ 40 _temporary_;
array y{&n};
if first.Territory then do;
n=0;call missing(of x{*});
end;
if Office not in x then do;n+1;x{n}=Office;end;
if last.Territory then do;
k=-1;
do ii=1 to 2**dim(y);
c.clear();call missing(list);count=0;
rx=graycode(k, of y{*});
if k=&k then do;
do i=1 to n;
if y{i}=1 then count+1;
end;
if count=&k then do;
do i=1 to n;
if y{i}=1 then do;
list=catx(',',list,x{i});
Office=x{i};
rc=ha.find();
do while(rc=0);
key=Customer;
c.replace();
rc=ha.find_next();
end;
end;
end;
unique_IDs=c.num_items; output;
end;
end;
end;
end;
keep Territory list unique_IDs;
run;
Xia Keshan
This appears to work!!! Thanks!!
It is not suitable for the combination which have many members.
If n=10 , You will get 2**10=1024 combination , that is really horrible .
I suggest to make a macro to wrap every kind of combination Like :
%if &k=2 %then %do;
........
%end;
%else %if &k=3 %then %do;
..................
etc .
Xia Keshan
How do you suggest I do it? I have 1,600 offices and want to be able to pick the most optimal 750, or 800, or 850.
When I use the new code you sent me, I get an error saying too many arguments for graycode.
Thanks.
That couldn't be. You gotta know How big number of combination it would generate . You can
NOTuse such algorithm( enumerate all the combinations).
You need some sophisticated algorithm like SIMPLEX(or cutting branch) method .That is beyond my knowledge . and you need SAS/OR for such large scale problem .
Suggest you post it at Mathematical Optimization and Operations Research , maybe some SAS/OR expert could help you.
Xia Keshan
Message was edited by: xia keshan
HaHa, I think I can solve it by Genetic Algorithm. I love GA a hell of a lot. 
Change n=10, if you want get the combination which have 10 members, and I pick up the highest count of unique ids for the combination.
NOTE: This code can only process one Territory one time .
data have;
input Territory $   Entity   $  Office  $   Customer;
cards;
Atlanta        JInc               A                 1
Atlanta        JInc               A                 2
Atlanta        JInc               A                 3
Atlanta        JInc               A                 21
Atlanta        JInc               B                 2
Atlanta        JInc               B                 4
Atlanta        JInc               C                 2
Atlanta        JInc               C                 3
Atlanta        JInc               C                 5
Atlanta        JInc               C                 8
Atlanta        HInc               D                 9
Atlanta        HInc               D                 10
Atlanta        HInc               E                 11
Atlanta        HInc               E                 14
Atlanta        HInc               E                 12
Atlanta        HInc               E                 16
Atlanta        HInc               E                 17
Atlanta        HInc               E                 18
;
run;
proc iml;
use have;
read all var {Office Customer};
close have;
n=4; /*--Change it(the number of combination's members).*/
nrow=nrow(Office );
encoding=j(2,nrow,1);
encoding[1,]=0;
start function(x) global(Office,Customer,n);
 count=ncol(unique(Customer[loc(x)])); 
 if ncol(unique(Office[loc(x)]))^=n then count=0; 
 return (count);
finish;
id=gasetup(2,nrow,1234);
call gasetobj(id,1,"function");
call gasetcro(id,0.95,2);
call gasetmut(id,0.05,3);
call gasetsel(id,100,1,1);
call gainit(id,10000,encoding);
niter = 100;
summary = j(niter,2);
mattrib summary [c = {"Max Count", "Avg Count"} l=""];
do i = 1 to niter;
 call garegen(id);
 call gagetval(value, id);
 summary[i,1] = value[1];
 summary[i,2] = value[:];
end;
call gagetmem(mem, value, id, 1);
Memebers=unique(Office[loc(mem)]);
print "Best Members:" Memebers[l=""],
      "Max Count:   " value[l = ""] ;
call gaend(id);
quit;
Xia Keshan
消息编辑者为:xia keshan
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.
